Abstract

We study the mixing time of systematic scan Markov chains on finite spin systems. It is
known that, in a single-site setting, the mixing time of systematic scan can be bounded in
terms of the influences sites have on each other. We generalise this technique for bounding the
mixing time of systematic scan to block dynamics, a setting in which a (constant size) set of
sites is updated simultaneously. In particular we consider the parameter $\alpha$, corresponding to
the maximum influence on any site, and show that if $\alpha < 1$ then the corresponding systematic
scan Markov chain mixes rapidly. As applications of this method we prove $O(\log n)$ mixing of
systematic scan (for any scan order) for heat-bath updates of edges for proper $q$-colourings of
a general graph with maximum vertex-degree $\Delta$ when $q \geq 2\Delta$. We also apply the method to
improve the number of colours required in order to obtain mixing in $O(\log n)$ scans for systematic
scan for heat-bath updates on trees, using some suitable block updates.

1 Introduction 

This paper is concerned with the study of finite spin systems. A spin system is composed of a set
of sites and a set of spins, both of which will be finite throughout this paper. The interconnection
between the sites is determined by an underlying graph. A configuration of the spin system is an
assignment of a spin to each site. If there are $n$ sites and $q$ available spins then this gives rise to $q^n$
configurations of the system; however, some configurations may be illegal. The specification of the
system determines how the spins interact with each other at a local level, such that different local
configurations on a subset of the graph may have different relative likelihoods. This interaction hence
specifies a probability distribution, $\pi$, on the set of configurations. One class of configurations that
receives much attention in theoretical computer science is proper $q$-colourings of graphs. A proper
colouring is a configuration where no two adjacent sites are assigned the same colour. One important
example of a spin system is when the set of legal configurations is the set of all proper $q$-colourings
of the underlying graph and $\pi$ is the uniform distribution on this set. In statistical physics the spin
system corresponding to proper $q$-colourings is known as the $q$-state anti-ferromagnetic Potts model
at zero temperature.

Sampling from $\pi$ is a computationally challenging task. It is, however, an important one and is
often carried out by simulating some suitable random dynamics on the set of configurations. Such
a dynamics must have the following two properties:

1. the dynamics eventually converges to $\pi$, and

2. the rate of convergence (mixing time) is polynomial in the number of sites.

It is generally straightforward to ensure that a dynamics converges to $\pi$ but much harder to provide
good upper bounds on the rate of convergence, which is what we will be concerned with in this
paper.

Arguably the simplest dynamics is the heat-bath Glauber dynamics which, at each step, selects
a site uniformly at random and updates the spin assigned to that site by drawing a new spin from
the distribution on the spin of the selected site induced by $\pi$. This procedure is repeated until the
distribution of the Markov chain is sufficiently close to $\pi$ using some suitable measure of closeness
between probability distributions. This dynamics falls under a family of Markov chains that we call
random update Markov chains. We say that a Markov chain is a random update Markov chain if
the sites are updated in a random order. This type of Markov chain has been frequently studied in
theoretical computer science and much is known about the mixing time of various random update
Markov chains.

An alternative to random update Markov chains is to construct a Markov chain that cycles
through the sites (or subsets of sites) in a deterministic order, updating the spins according to the
induced distribution. We call this a systematic scan Markov chain (or systematic scan for short).
Although systematic scan updates the sites in a deterministic order it remains a random process
since the procedure used to update the spin assigned to a site is randomised, as specified by the
appropriate induced distribution. Systematic scan may be more intuitively appealing than random
update in terms of implementation; however, until recently little was known about the convergence
rates of this type of dynamics. It remains important to know how many steps one needs to simulate
a systematic scan for in order for it to become sufficiently close to its stationary distribution, and
recently there has been an interest among computer scientists in investigating various approaches
for analysing the mixing time of systematic scan Markov chains, see e.g. Dyer, Goldberg and
Jerrum [5], [7] and Bordewich, Dyer and Karpinski [2]. In this paper we present a new method
for analysing the mixing time of systematic scan Markov chains, which is applicable to any spin
system. As applications of this method we improve the known parameters required for rapid mixing
of systematic scan on

1. proper colourings of general graphs and 

2. proper colourings of trees. 

A key ingredient in our method for proving mixing of systematic scan is to work with a block dynamics.
A block dynamics is a dynamics in which we allow a set of sites to be updated simultaneously
as opposed to updating one site at a time as in the description of the Glauber dynamics above.
Block dynamics is not a new concept and it was used in the mid 1980s by Dobrushin and Shlosman
[4] in their study of conditions that imply uniqueness of the Gibbs measure of a spin system, a topic
closely related to studying the mixing time of Markov chains (see for example Weitz's PhD thesis
[16]). More recently, a block dynamics has been used by Weitz [17] when, in a generalisation of the
work of Dobrushin and Shlosman, studying the relationship between various influence parameters
(also in the context of Gibbs measures) within spin systems and using the influence parameters to
establish conditions that imply mixing. Using an influence parameter to establish a condition which
implies mixing of systematic scan is a key aspect of the method presented in this paper as we will
discuss below. Dyer, Sinclair, Vigoda and Weitz [8] have also used a block dynamics in the context
of analysing the mixing time of a Markov chain for proper colourings of the square lattice. Both
of these papers consider a random update Markov chain; however, several of the ideas and techniques
carry over to systematic scan as we shall see.

We will bound the mixing time of systematic scan by studying the influence that the sites of the
graph have on each other. This technique is well known, and the influence parameters generalised
by Weitz [17], "the influence on a site is small" (originally attributed to Dobrushin [3]) and "the
influence of a site is small" (originally Dobrushin and Shlosman [4]), both imply mixing of the
corresponding random update Markov chain. It is worth pointing out that a condition of the form
"if the influence on a site is small then the corresponding dynamics converges to $\pi$ quickly" is known
as a Dobrushin condition. In the context of systematic scan, Dyer et al. [5] point out that, in a
single-site setting, the condition "the influence on a site is small" implies rapid mixing of systematic
scan. Our method for proving rapid mixing of systematic scan is a generalisation of this influence
parameter to block dynamics.

We now formalise the concepts above and state our results. Let $C = \{1, \dots, q\}$ be the set of
spins and $G = (V, E)$ be the underlying graph of the spin system where $V = \{1, \dots, n\}$ is the set of
sites. We associate with each site $i \in V$ a positive weight $w_i$. Let $\Omega^+$ be the set of all configurations
of the spin system and $\Omega \subseteq \Omega^+$ be the set of all legal configurations. Then let $\pi$ be a probability
distribution on $\Omega^+$ whose support is $\Omega$, i.e., $\{x \in \Omega^+ \mid \pi(x) > 0\} = \Omega$. If $x \in \Omega^+$ is a configuration
and $j \in V$ is a site then $x_j$ denotes the spin assigned to site $j$ in configuration $x$. For each site
$j \in V$, let $S_j$ denote the set of pairs $(x, y) \in \Omega^+ \times \Omega^+$ of configurations that only differ on the spin
assigned to site $j$, that is $x_i = y_i$ for all $i \neq j$.

We will use Weitz's [17] notation for block dynamics, although we only consider a finite collection
of blocks. Define a collection of $m$ blocks $\Theta = \{\Theta_k\}_{k=1,\dots,m}$ such that each block $\Theta_k \subseteq V$ and $\Theta$
covers $V$, where we say that $\Theta$ covers $V$ if $\bigcup_{k=1}^{m} \Theta_k = V$. One site may be contained in several
blocks and the size of each block is not required to be the same; we do however require that the
size of each block is bounded independently of $n$. For any block $\Theta_k$ and a pair of configurations
$x, y \in \Omega^+$ we write "$x = y$ on $\Theta_k$" if $x_i = y_i$ for each $i \in \Theta_k$ and similarly "$x = y$ off $\Theta_k$" if $x_i = y_i$
for each $i \in V \setminus \Theta_k$. We also let $\partial\Theta_k = \{i \in V \setminus \Theta_k \mid \exists j \in \Theta_k : \{i, j\} \in E(G)\}$ denote the set of
sites adjacent to but not included in $\Theta_k$; we will refer to $\partial\Theta_k$ as the boundary of $\Theta_k$.

With each block $\Theta_k$, we associate a transition matrix $P^{[k]}$ on state space $\Omega^+$ satisfying the
following two properties:

1. If $P^{[k]}(x, y) > 0$ then $x = y$ off $\Theta_k$, and also

2. $\pi$ is invariant with respect to $P^{[k]}$.

Property 1 ensures that an application of $P^{[k]}$ moves the state of the system from one configuration
to another by only updating the sites contained in the block $\Theta_k$ and Property 2 ensures that
any dynamics composed solely of transitions defined by $P^{[k]}$ converges to $\pi$. While the requirements
of Property 1 are clear we take a moment to discuss what we mean in Property 2. Consider the
following two-step process in which some configuration $x$ is initially drawn from $\pi$ and then a
configuration $y$ is drawn from $P^{[k]}(x)$, where $P^{[k]}(x)$ is the distribution on configurations resulting from
applying $P^{[k]}$ to a configuration $x$. We then say that $\pi$ is invariant with respect to $P^{[k]}$ if for each
configuration $\sigma \in \Omega^+$ we have $\Pr(x = \sigma) = \Pr(y = \sigma)$. That is, the distribution on configurations
generated by the two-step process is the same as if only the first step was executed. In terms of our
dynamics this means that once the distribution of the dynamics reaches $\pi$, $\pi$ will continue to be the
distribution of the dynamics even after applying $P^{[k]}$ to the state of the dynamics. Our main result
(Theorem 2) holds for any choice of update rule $P^{[k]}$ provided that it satisfies these two properties.

The distribution $P^{[k]}(x)$, which specifies how the dynamics updates block $\Theta_k$, clearly depends
on the specific update rule implemented as $P^{[k]}$. In order to make this idea clearer we describe
one particular update rule, known as the heat-bath update rule. This example serves a dual purpose
as it is a simple way to implement $P^{[k]}$ and we will make use of heat-bath updates in Sections 3
and 4 when applying our condition to specific spin systems. A heat-bath move on a block $\Theta_k$ given
a configuration $x$ is performed by drawing a new configuration from the distribution induced by
$\pi$ and consistent with the assignment of spins on the boundary of $\Theta_k$. The two properties of $P^{[k]}$
hold for heat-bath updates since (1) only the spins assigned to the sites in $\Theta_k$ are changed
and (2) the new configuration is drawn from an appropriate distribution induced by $\pi$. If the spin
system corresponds to proper colourings of graphs then the distribution used in a heat-bath move
is the uniform distribution on the set of configurations that agree with $x$ off $\Theta_k$ and where no edge
containing a site in $\Theta_k$ is monochromatic.
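The heat-bath move for proper colourings described above can be sketched in a few lines of Python. This is only an illustrative brute-force version under stated assumptions: the graph is given as an adjacency dictionary `adj`, a configuration is a dictionary from sites to colours in $\{1, \dots, q\}$, and the function name `heat_bath_block_update` is hypothetical. Enumerating all $q^{|\Theta_k|}$ assignments is reasonable only because block sizes are bounded independently of $n$.

```python
import itertools
import random


def heat_bath_block_update(adj, colouring, block, q, rng=random):
    """Resample the colours on `block` uniformly among all assignments that
    agree with `colouring` off the block and leave no edge containing a
    block site monochromatic."""
    block = list(block)
    candidates = []
    for assignment in itertools.product(range(1, q + 1), repeat=len(block)):
        trial = dict(colouring)
        trial.update(zip(block, assignment))
        # Check every edge that has at least one endpoint in the block.
        if all(trial[u] != trial[v] for u in block for v in adj[u]):
            candidates.append(trial)
    if not candidates:
        raise ValueError("no valid recolouring of the block exists")
    return rng.choice(candidates)


# Path 1 - 2 - 3 with q = 3 colours; update the edge block {1, 2}.
adj = {1: [2], 2: [1, 3], 3: [2]}
x = {1: 1, 2: 2, 3: 1}
y = heat_bath_block_update(adj, x, [1, 2], q=3, rng=random.Random(0))
```

Only the sites in the block can change, which is Property 1; drawing uniformly from the consistent colourings is what makes the uniform distribution on proper colourings invariant (Property 2).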

With these definitions in mind we are ready to formally define a systematic scan Markov chain. 

Definition 1. We let $\mathcal{M}_{\to}$ be a systematic scan Markov chain with state space $\Omega^+$ and transition
matrix $P_{\to} = \prod_{k=1}^{m} P^{[k]}$.

The stationary distribution of $\mathcal{M}_{\to}$ is $\pi$ as discussed above, and it is worth pointing out that
the definition of $\mathcal{M}_{\to}$ holds for any order on the set of blocks. We will refer to one application of
$P_{\to}$ (that is, updating each block once) as one scan of $\mathcal{M}_{\to}$. One scan takes $\sum_k |\Theta_k|$ updates and it
is generally straightforward to ensure, via the construction of the set of blocks, that this sum is of
order $O(n)$.
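To make Definition 1 concrete, the following sketch builds the heat-bath matrices $P^{[k]}$ exactly for a tiny example and checks that one scan leaves the uniform distribution unchanged. The example (a path on three sites labelled 0, 1, 2, three colours, edge blocks, state space restricted to proper colourings) and all function names are assumptions made for illustration.

```python
import itertools
from fractions import Fraction


def proper(adj, x):
    """True if no edge of the graph is monochromatic under colouring x."""
    return all(x[u] != x[v] for u in adj for v in adj[u])


def block_matrix(states, block):
    """Rows P^[k](x, .) of the heat-bath update on `block`: uniform over
    the proper colourings that agree with x off the block."""
    P = {}
    for x in states:
        targets = [y for y in states
                   if all(x[v] == y[v] for v in range(len(x)) if v not in block)]
        P[x] = {y: Fraction(1, len(targets)) for y in targets}
    return P


def apply(dist, P):
    """Push the distribution `dist` through one transition matrix."""
    out = {}
    for x, px in dist.items():
        for y, pxy in P[x].items():
            out[y] = out.get(y, 0) + px * pxy
    return out


def scan(dist, matrices):
    """One scan: apply P^[1], ..., P^[m] in the given order."""
    for P in matrices:
        dist = apply(dist, P)
    return dist


adj = {0: [1], 1: [0, 2], 2: [1]}
states = [x for x in itertools.product(range(3), repeat=3) if proper(adj, x)]
matrices = [block_matrix(states, b) for b in ({0, 1}, {1, 2})]
pi = {x: Fraction(1, len(states)) for x in states}
print(scan(pi, matrices) == pi)  # invariance of the uniform distribution
```

Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point comparison.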

We will be concerned with analysing the mixing time of systematic scan Markov chains, and
consider the case when $\mathcal{M}_{\to}$ is ergodic. Let $\mathcal{M}$ be any ergodic Markov chain with state space
$\Omega^+$ and transition matrix $P$. By classical theory (see e.g. Aldous [1]) $\mathcal{M}$ has a unique stationary
distribution, which we will denote $\pi$. The mixing time from an initial configuration $x \in \Omega^+$ is
the number of steps, that is applications of $P$, required for $\mathcal{M}$ to become sufficiently close to $\pi$.
Formally the mixing time of $\mathcal{M}$ from an initial configuration $x \in \Omega^+$ is defined, as a function of the
deviation $\varepsilon$ from stationarity, by

$$\mathrm{Mix}_x(\mathcal{M}, \varepsilon) = \min\{t > 0 : d_{\mathrm{TV}}(P^t(x, \cdot), \pi(\cdot)) \leq \varepsilon\}$$

where

$$d_{\mathrm{TV}}(\theta_1, \theta_2) = \frac{1}{2} \sum_{i} |\theta_1(i) - \theta_2(i)| = \max_{A \subseteq \Omega^+} |\theta_1(A) - \theta_2(A)|$$

is the total variation distance between two distributions $\theta_1$ and $\theta_2$ on $\Omega^+$. The mixing time
$\mathrm{Mix}(\mathcal{M}, \varepsilon)$ of $\mathcal{M}$ is then obtained by maximising over all possible initial configurations

$$\mathrm{Mix}(\mathcal{M}, \varepsilon) = \max_{x \in \Omega^+} \mathrm{Mix}_x(\mathcal{M}, \varepsilon).$$

We say that $\mathcal{M}$ is rapidly mixing if the mixing time of $\mathcal{M}$ is polynomial in $n$ and $\log(\varepsilon^{-1})$.
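For very small chains the definitions of total variation distance and mixing time can be evaluated directly. The sketch below is a brute-force illustration, assuming the transition matrix is stored as a dictionary of rows; the names `total_variation`, `step` and `mixing_time` are hypothetical, and the two-state lazy chain is an example chosen only because its distances are easy to verify by hand.

```python
def total_variation(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support) / 2


def step(dist, P):
    """Distribution after one transition of the chain with rows P[x]."""
    out = {}
    for x, px in dist.items():
        for y, pxy in P[x].items():
            out[y] = out.get(y, 0.0) + px * pxy
    return out


def mixing_time(P, pi, eps, max_steps=10_000):
    """Maximum over starting states x of the first t > 0 at which
    d_TV(P^t(x, .), pi) <= eps."""
    worst = 0
    for x in P:
        dist = {x: 1.0}
        for t in range(1, max_steps + 1):
            dist = step(dist, P)
            if total_variation(dist, pi) <= eps:
                break
        else:
            raise RuntimeError("did not mix within max_steps")
        worst = max(worst, t)
    return worst


# Lazy two-state chain: the distance to uniform halves at every step.
lazy = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.25, 1: 0.75}}
uniform = {0: 0.5, 1: 0.5}
print(mixing_time(lazy, uniform, eps=0.1))
```

For this chain the distance after $t$ steps is $2^{-(t+1)}$, so the first $t$ with distance at most $0.1$ is $t = 3$.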
We will now formalise the notion of "the influence on a site" in order to state our condition for
rapid mixing of systematic scan. For any pair of configurations $(x, y)$ let $\Psi_k(x, y)$ be a coupling
of the distributions $P^{[k]}(x)$ and $P^{[k]}(y)$ which we will refer to as "updating block $\Theta_k$". Recall
that a coupling $\Psi_k(x, y)$ of $P^{[k]}(x)$ and $P^{[k]}(y)$ is a joint distribution on $\Omega^+ \times \Omega^+$ whose marginal
distributions are $P^{[k]}(x)$ and $P^{[k]}(y)$. That is

$$\forall \sigma \in \Omega^+ \quad \Pr_{x' \in P^{[k]}(x)}(x' = \sigma) = \sum_{\tau} \Pr_{(x', y') \in \Psi_k(x, y)}(x' = \sigma, y' = \tau)$$

and

$$\forall \tau \in \Omega^+ \quad \Pr_{y' \in P^{[k]}(y)}(y' = \tau) = \sum_{\sigma} \Pr_{(x', y') \in \Psi_k(x, y)}(x' = \sigma, y' = \tau)$$

where we write $(x', y') \in \Psi_k(x, y)$ when the pair of configurations $(x', y')$ is drawn from $\Psi_k(x, y)$.
Weitz in [17] states his conditions for general metrics whereas we will use Hamming distance, which
is also how the corresponding condition is defined in Dyer et al. [5]. This choice of metric allows
us to define the influence of a site $i$ on a site $j$ under a block $\Theta_k$, which we will denote $\rho^k_{i,j}$, as
the maximum probability that two coupled Markov chains differ at the spin of site $j$ following an
update of $\Theta_k$ starting from two configurations that only differ at the spin on site $i$. That is

$$\rho^k_{i,j} = \max_{(x, y) \in S_i} \Pr_{(x', y') \in \Psi_k(x, y)}(x'_j \neq y'_j).$$

Then let $\alpha$ be the total (weighted) influence on any site in the graph, defined by

$$\alpha = \max_{k} \max_{j \in \Theta_k} \sum_{i \in V} \frac{w_i}{w_j} \rho^k_{i,j}.$$
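Once the influences $\rho^k_{i,j}$ are known (or bounded), the parameter $\alpha$ is a plain maximum of weighted column sums. The sketch below shows that computation; the data layout (a list of dictionaries keyed by pairs `(i, j)`), the function name `influence_alpha` and the numerical influences in the usage example are all hypothetical and are not taken from the paper.

```python
def influence_alpha(rho, blocks, weights):
    """alpha = max over blocks k and sites j in block k of
    sum_i (w_i / w_j) * rho[k][(i, j)], with missing entries read as 0."""
    best = 0.0
    for k, block in enumerate(blocks):
        for j in block:
            total = sum(weights[i] / weights[j] * rho[k].get((i, j), 0.0)
                        for i in weights)
            best = max(best, total)
    return best


# One block {1, 2} on the path 0 - 1 - 2 - 3 with unit weights.
blocks = [{1, 2}]
weights = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
rho = [{(0, 1): 0.25, (0, 2): 0.0625, (3, 2): 0.25, (3, 1): 0.0625}]
print(influence_alpha(rho, blocks, weights))  # 0.3125
```

Note that the sum runs over all sites $i$, while the maximum is taken only over sites $j$ inside the block being updated.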

We point out that our definition of $\rho^k_{i,j}$ is not the standard definition of $\rho$ used in the literature (see
for example Simon [14] or Dyer et al. [5]) since the coupling $\Psi_k(x, y)$ is explicitly included. In the
block setting it is, however, necessary to include the coupling directly in the definition of $\rho$ as we
will discuss in Section 5. In Section 5 we also show that the condition $\alpha < 1$ is a generalisation
of the corresponding condition in Dyer et al. [5] in the sense that if each block contains exactly
one site and the coupling minimises the Hamming distance then the conditions coincide. Our main
theorem, which is proved in Section 2, states that if the influence on a site is sufficiently small then
the systematic scan Markov chain $\mathcal{M}_{\to}$ mixes in $O(\log n)$ scans.

Theorem 2. Suppose $\alpha < 1$. Then

$$\mathrm{Mix}(\mathcal{M}_{\to}, \varepsilon) \leq \frac{\log(n\varepsilon^{-1})}{1 - \alpha}.$$
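As a quick numerical reading of Theorem 2, the bound can be evaluated for given $n$, $\varepsilon$ and $\alpha$. This is a minimal sketch; the function name `scan_bound` is hypothetical and the logarithm is assumed to be the natural logarithm.

```python
import math


def scan_bound(n, eps, alpha):
    """Upper bound log(n / eps) / (1 - alpha) on the number of scans,
    valid only under the hypothesis alpha < 1 (natural log assumed)."""
    if not 0 <= alpha < 1:
        raise ValueError("the bound requires 0 <= alpha < 1")
    return math.log(n / eps) / (1 - alpha)


print(scan_bound(n=100, eps=0.01, alpha=0.5))
```

The bound grows only logarithmically in $n$ and $\varepsilon^{-1}$ but blows up as $\alpha$ approaches 1, which is why the applications concentrate on bounding $\alpha$ away from 1.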

As previously stated we will apply Theorem 2 to two spin systems corresponding to proper
$q$-colourings of graphs in order to improve the parameters for which systematic scan mixes. In both
applications we restrict the state space of the Markov chains to the set of proper colourings, $\Omega$, of
the underlying graph. Firstly we allow the underlying graph to be any finite graph with maximum
vertex-degree $\Delta$. Previously, the least number of colours for which systematic scan was known to
mix in $O(\log n)$ scans was $q > 2\Delta$ and when $q = 2\Delta$ the best known bound on the mixing time
was $O(n^2 \log n)$ scans due to Dyer et al. [5]. For completeness we pause to mention that the least
number of colours required for rapid mixing of a random update Markov chain is $q > (11/6)\Delta$ due
to Vigoda [15]. In Section 3 we consider the following Markov chain, edge scan, denoted $\mathcal{M}_{\mathrm{edge}}$,
which updates the endpoints of an edge during each update. Let $\Theta = \{\Theta_k\}_{k=1,\dots,m}$ be a set of edges in
$G$ such that $\Theta$ covers $V$. Using the above notation, $P^{[k]}$ is the transition matrix for performing a
heat-bath move on the endpoints of the edge $\Theta_k$ and the transition matrix of $\mathcal{M}_{\mathrm{edge}}$ is $\prod_{k=1}^{m} P^{[k]}$.
We prove the following theorem, which improves the mixing time of systematic scan by a factor
of $n^2$ for proper colourings of general graphs when $q = 2\Delta$ and matches the existing bound when
$q > 2\Delta$.

Table 1: Optimising the number of colours using blocks

$\Delta$    $h$    …    $f(\Delta)$    $\lceil \Delta + 2\sqrt{\Delta - 1} \rceil$
3     15     …    5      6
4     3      …    7      8
5     12     …    8      9
6     3      …    10     11
7     7      …    11     12
8     13     …    12     14
9     85     …    13     15
10    5      …    15     16

Theorem 3. Let $G$ be a graph with maximum vertex-degree $\Delta$. If $q \geq 2\Delta$ then

$$\mathrm{Mix}(\mathcal{M}_{\mathrm{edge}}, \varepsilon) \leq \Delta^2 \log(n\varepsilon^{-1}).$$

Next, in Section 4, we restrict the class of graphs to trees. It is known that single-site systematic
scan mixes in $O(\log n)$ scans when $q > \Delta + 2\sqrt{\Delta - 1}$ and in $O(n^2 \log n)$ scans when $q = \Delta + 2\sqrt{\Delta - 1}$
is an integer; see e.g. Hayes [11] or Dyer, Goldberg and Jerrum [6]. More generally it is known that
systematic scan for proper colourings of bipartite graphs mixes in $O(\log n)$ scans when $q > 1.76\Delta$
as $\Delta \to \infty$ due to Bordewich et al. [2]. Again, for completeness, we mention that
a random update Markov chain for proper colourings on a tree mixes in $O(n \log n)$ updates when
$q \geq \Delta + 2$, a result due to Martinelli, Sinclair and Weitz [13], improving a similar result by Kenyon,
Mossel and Peres [12]. We will use a block approach to improve the number of colours required
for mixing of systematic scan on trees. We construct the following set of blocks $\Theta$ where the height
$h$ of the blocks is defined in Table 1. Let a block $\Theta_k$ contain a site $r$ along with all sites below $r$
in the tree that are at most $h - 1$ edges away from $r$. The set of blocks $\Theta$ covers the sites of the
tree and we construct $\Theta$ such that no block has height less than $h$. $P^{[k]}$ is the transition matrix for
performing a heat-bath move on block $\Theta_k$ and the transition matrix of the Markov chain $\mathcal{M}_{\mathrm{tree}}$ is
$\prod_{k=1}^{m} P^{[k]}$ where $m$ is the number of blocks. We prove the following theorem.

Theorem 4. Let $G$ be a tree with maximum vertex-degree $\Delta$. If $q > f(\Delta)$ where $f(\Delta)$ is specified
in Table 1 for small $\Delta$ then

$$\mathrm{Mix}(\mathcal{M}_{\mathrm{tree}}, \varepsilon) = O(\log(n\varepsilon^{-1})).$$

We conclude the paper with a discussion, in Section 5, of the influence parameter $\alpha$ and how it
relates to the corresponding parameters for the "influence on a site" in Weitz [17] and Dyer et al.
[5]. In particular we will show that the condition in Weitz [17] does not imply mixing of systematic
scan and that the condition in Dyer et al. [5] is a special case of our condition from Theorem 2.
2 Bounding the Mixing Time of Systematic Scan

This section will contain the proof of Theorem 2. The proof follows the structure of the proof from
the single-site setting in Dyer et al. [5], which follows Föllmer's [9] account of Dobrushin's proof
presented in Simon's book [14].

We will make use of the following definitions. For any function $f : \Omega^+ \to \mathbb{R}_{\geq 0}$ let $\delta_i(f) =
\max_{(x, y) \in S_i} |f(x) - f(y)|$ and $\Delta(f) = \sum_{i \in V} w_i \delta_i(f)$. Also for any transition matrix $P$ define $(Pf)$
as the function from $\Omega^+$ to $\mathbb{R}_{\geq 0}$ given by $(Pf)(x) = \sum_{x'} P(x, x') f(x')$. Finally let $\mathbb{1}_{i \notin \Theta_k}$ be the
function given by

$$\mathbb{1}_{i \notin \Theta_k} = \begin{cases} 1 & \text{if } i \notin \Theta_k, \\ 0 & \text{otherwise.} \end{cases}$$



We can think of $\delta_i(f)$ as the deviation from constancy of $f$ at site $i$ and $\Delta(f)$ as the aggregated
deviation from constancy of $f$. Now, $Pf$ is a function where $(Pf)(x)$ gives the expected value of
$f$ after making a transition starting from $x$. Intuitively, if $t$ transitions are sufficient for mixing
then $P^t f$ is a very smooth function. An application of $P^{[k]}$ fixes the non-constancy of $f$ at the sites
within $\Theta_k$ although possibly at the cost of increasing the non-constancy at sites on the boundary
of $\Theta_k$. Our aim is then to show that one application of $P_{\to}$ will on aggregate make $f$ smoother, i.e.,
decrease $\Delta(f)$. We will establish the following lemma, which corresponds to Corollary 12 in Dyer et
al. [5], from which Section 3.3 of [5] implies Theorem 2.

Lemma 5. If $\alpha < 1$ then

$$\Delta(P_{\to} f) \leq \alpha \Delta(f).$$
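The quantities $\delta_i(f)$ and $\Delta(f)$ appearing in Lemma 5 can be computed by brute force on a toy state space, which may help in reading the proofs that follow. The sketch assumes sites are labelled $0, \dots, n-1$, spins are $0, \dots, q-1$ and every configuration is allowed; the function names are hypothetical.

```python
import itertools


def delta_i(f, i, n, q):
    """max |f(x) - f(y)| over pairs (x, y) that differ only at site i."""
    best = 0
    for x in itertools.product(range(q), repeat=n):
        for c in range(q):
            y = x[:i] + (c,) + x[i + 1:]
            best = max(best, abs(f(x) - f(y)))
    return best


def aggregated_delta(f, n, q, weights):
    """Delta(f) = sum over sites i of w_i * delta_i(f)."""
    return sum(weights[i] * delta_i(f, i, n, q) for i in range(n))


# f counts the sites with spin 0, so changing one site moves f by at most 1.
count_zeros = lambda x: x.count(0)
print(aggregated_delta(count_zeros, n=3, q=2, weights=[1, 1, 1]))  # 3
```

A function that ignores site $i$ has $\delta_i(f) = 0$, and a constant function has $\Delta(f) = 0$, which is the sense in which $\Delta$ measures deviation from constancy.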

We begin by bounding the effect on $f$ from one application of $P^{[k]}$. The following lemma is a
block-move generalisation of Proposition V.1.7 from Simon [14] and Lemma 10 from Dyer et al. [5].

Lemma 6. $\delta_i(P^{[k]} f) \leq \mathbb{1}_{i \notin \Theta_k} \delta_i(f) + \sum_{j \in \Theta_k} \rho^k_{i,j} \delta_j(f)$.

Proof. Take $\mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(x')]$ to be the expected value of $f(x')$ when a pair of configurations
$(x', y')$ is drawn from $\Psi_k(x, y)$. Since $\Psi_k(x, y)$ is a coupling of the distributions $P^{[k]}(x)$ and $P^{[k]}(y)$,
the distribution $P^{[k]}(x)$ and the first component of $\Psi_k(x, y)$ are the same and hence

$$\mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(x')] = \mathbb{E}_{x' \in P^{[k]}(x)}[f(x')] \qquad (1)$$

and the same fact holds for the distribution $P^{[k]}(y)$ so

$$\mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(y')] = \mathbb{E}_{y' \in P^{[k]}(y)}[f(y')]. \qquad (2)$$
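Using (1), (2) and linearity of expectation we have

$$
\begin{aligned}
\delta_i(P^{[k]} f) &= \max_{(x, y) \in S_i} \left| (P^{[k]} f)(x) - (P^{[k]} f)(y) \right| \\
&= \max_{(x, y) \in S_i} \Big| \sum_{x'} P^{[k]}(x, x') f(x') - \sum_{y'} P^{[k]}(y, y') f(y') \Big| \\
&= \max_{(x, y) \in S_i} \left| \mathbb{E}_{x' \in P^{[k]}(x)}[f(x')] - \mathbb{E}_{y' \in P^{[k]}(y)}[f(y')] \right| \\
&= \max_{(x, y) \in S_i} \left| \mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(x')] - \mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(y')] \right| \\
&= \max_{(x, y) \in S_i} \left| \mathbb{E}_{(x', y') \in \Psi_k(x, y)}[f(x') - f(y')] \right| \\
&\leq \max_{(x, y) \in S_i} \mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ |f(x') - f(y')| \right] \\
&\leq \max_{(x, y) \in S_i} \mathbb{E}_{(x', y') \in \Psi_k(x, y)}\Big[ \sum_{j \in V} \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \Big] \\
&= \max_{(x, y) \in S_i} \sum_{j \in V} \mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \right].
\end{aligned}
$$

Notice that $x = x'$ off $\Theta_k$ and $y = y'$ off $\Theta_k$.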

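We need to bound the expectation $\mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \right]$
for each site $j \in V$. There are three cases.

• $j \in \Theta_k$. By definition of $\rho^k_{i,j}$ the coupling will yield $x'_j \neq y'_j$ with probability at most $\rho^k_{i,j}$ and so

$$\mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \right] \leq \rho^k_{i,j} \max_{(\sigma, \tau) \in S_j} |f(\sigma) - f(\tau)| = \rho^k_{i,j} \delta_j(f).$$

• $j \notin \Theta_k$ and $j = i$. Since $j \notin \Theta_k$ we have $x_j = x'_j$ and $y_j = y'_j$ so

$$\mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \right] \leq \max_{(\sigma, \tau) \in S_i} |f(\sigma) - f(\tau)| = \delta_i(f).$$

• $j \notin \Theta_k$ and $j \neq i$. In this case we have $x_j = x'_j$ and $y_j = y'_j$, which implies $x'_j = y'_j$ so

$$\mathbb{E}_{(x', y') \in \Psi_k(x, y)}\left[ \left| f(x'_1 \dots x'_j y'_{j+1} \dots y'_n) - f(x'_1 \dots x'_{j-1} y'_j \dots y'_n) \right| \right] = 0.$$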

Adding it up we get the statement of the lemma. □

We will use Lemma 6 in conjunction with an inductive proof similar to (V.1.16) in Simon [14]
in order to establish the following lemma. It is important to note at this point that the result in
Simon is presented for single-site heat-bath updates, whereas the following lemma applies to any
block dynamics (satisfying the stated assumptions) and weighted sites. This lemma is also a block
generalisation of Lemma 11 in Dyer et al. [5].
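Lemma 7. Let $\Gamma(k) = \bigcup_{l=1}^{k} \Theta_l$. Then for any $k \in \{1, \dots, m\}$, if $\alpha < 1$ then

$$\Delta(P^{[1]} \cdots P^{[k]} f) \leq \alpha \sum_{i \in \Gamma(k)} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f).$$

Proof. Induction on $k$. Taking $k = 0$ as the base case, we get the definition of $\Delta$.
Assume the statement holds for $k - 1$. Then

$$
\begin{aligned}
\Delta(P^{[1]} \cdots P^{[k]} f) &\leq \alpha \sum_{i \in \Gamma(k-1)} w_i \delta_i(P^{[k]} f) + \sum_{i \in V \setminus \Gamma(k-1)} w_i \delta_i(P^{[k]} f) \\
&\leq \alpha \sum_{i \in \Gamma(k-1)} \mathbb{1}_{i \notin \Theta_k} w_i \delta_i(f) + \alpha \sum_{i \in \Gamma(k-1)} \sum_{j \in \Theta_k} w_i \rho^k_{i,j} \delta_j(f) \\
&\quad + \sum_{i \in V \setminus \Gamma(k-1)} \mathbb{1}_{i \notin \Theta_k} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k-1)} \sum_{j \in \Theta_k} w_i \rho^k_{i,j} \delta_j(f)
\end{aligned}
$$

by Lemma 6.

Simplifying and using $\alpha < 1$,

$$
\begin{aligned}
\Delta(P^{[1]} \cdots P^{[k]} f) &\leq \alpha \sum_{i \in \Gamma(k-1) \setminus \Theta_k} w_i \delta_i(f) + \sum_{i \in \Gamma(k-1)} \sum_{j \in \Theta_k} w_i \rho^k_{i,j} \delta_j(f) \\
&\quad + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k-1)} \sum_{j \in \Theta_k} w_i \rho^k_{i,j} \delta_j(f) \\
&= \alpha \sum_{i \in \Gamma(k-1) \setminus \Theta_k} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f) \\
&\quad + \sum_{j \in \Theta_k} \delta_j(f) \Big( \sum_{i \in \Gamma(k-1)} w_i \rho^k_{i,j} + \sum_{i \in V \setminus \Gamma(k-1)} w_i \rho^k_{i,j} \Big) \\
&= \alpha \sum_{i \in \Gamma(k-1) \setminus \Theta_k} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f) + \sum_{j \in \Theta_k} \delta_j(f) \sum_{i \in V} w_i \rho^k_{i,j} \\
&\leq \alpha \sum_{i \in \Gamma(k-1) \setminus \Theta_k} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f) + \sum_{j \in \Theta_k} w_j \delta_j(f) \max_{j' \in \Theta_k} \sum_{i \in V} \frac{w_i}{w_{j'}} \rho^k_{i,j'} \\
&\leq \alpha \sum_{i \in \Gamma(k-1) \setminus \Theta_k} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f) + \alpha \sum_{j \in \Theta_k} w_j \delta_j(f) \\
&= \alpha \sum_{i \in \Gamma(k)} w_i \delta_i(f) + \sum_{i \in V \setminus \Gamma(k)} w_i \delta_i(f)
\end{aligned}
$$

by definition of $\alpha$.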

Lemma 5 is now a simple consequence of Lemma 7 since

$$\Delta(P_{\to} f) = \Delta(P^{[1]} \cdots P^{[m]} f) \leq \alpha \sum_{i \in V} w_i \delta_i(f) = \alpha \Delta(f)$$

and Theorem 2 follows as discussed above.
3 Application: Edge Scan on an Arbitrary Graph

In this section we prove Theorem 3. That is, we present a general version of a systematic scan on
edges and use Theorem 2 to prove that it mixes in $O(\log n)$ scans when $q \geq 2\Delta$. We use $w_i = 1$
for all $i \in V$ and so omit all weights throughout this section. Recall that $\mathcal{M}_{\mathrm{edge}}$ is the systematic
scan Markov chain with transition matrix $\prod_{k=1}^{m} P^{[k]}$ where $\Theta = \{\Theta_k\}_{k=1,\dots,m}$ is an ordered set of
edges in $G$ that covers $V$ and $P^{[k]}$ is the transition matrix for performing a heat-bath move on the
endpoints of the edge $\Theta_k$.

We need to construct a coupling $\Psi_k(x, y)$ of the distributions $P^{[k]}(x)$ and $P^{[k]}(y)$ for each pair
of configurations $(x, y) \in S_i$ that differ only at the colour assigned to site $i$. Assume without loss of
generality that $x_i = 1$ and $y_i = 2$ and also let $j$ and $j'$ be the endpoints of the edge $\Theta_k$. Recall that,
since the dynamics uses heat-bath updates, $P^{[k]}(x)$ is the uniform distribution on configurations
that agree with $x$ off $\Theta_k$ and where no edge containing $j$ or $j'$ is monochromatic. For ease of notation
we let $D_1 = P^{[k]}(x)$ and $D_2 = P^{[k]}(y)$. We go on to make the following definitions for $l \in \{1, 2\}$ and
$s \in \Theta_k$. $D_l(s)$ is the distribution of the colour assigned to site $s$ induced by $D_l$, and $[D_l \mid s = c]$ is
the uniform distribution on the set of colourings of the sites in $\Theta_k$ where site $s$ is assigned colour
$c$. We also let $d_l$ denote the number of configurations with positive measure in $D_l$ and $d_{l,s=c}$ be the
number of configurations that assign colour $c$ to site $s$ and have positive measure in $D_l$.

Definition 8. We will say that the choice $c_1 c_2$ is "valid" for $D_l$ if there is a configuration with
positive measure in $D_l$ in which site $j$ is coloured $c_1$ and site $j'$ is coloured $c_2$. Similarly a colour $c$
is "valid" on a site $s$ in $D_l$ if there exists a valid choice for $D_l$ where site $s$ is coloured $c$.

3.1 Overview of the Coupling 

We begin the construction of the coupling by giving an overview of the cases we will
need to consider and show that they are mutually exclusive and exhaustive of all configurations.
It is important to note that, by definition of $\rho$, the coupling we define may depend on the initial
configurations $x$ and $y$ in the sense that if two pairs of configurations $(x_1, y_1)$ and $(x_2, y_2)$ can be
distinguished then the couplings $\Psi_k(x_1, y_1)$ and $\Psi_k(x_2, y_2)$ may be defined differently.

First, if $i$ is not adjacent to any site in $\Theta_k$, that is $i \notin \partial\Theta_k$, then $\Psi_k(x, y)$ is the identity coupling
where the same colouring is assigned to each distribution. Hence, for $i \notin \partial\Theta_k$ and $j \in \Theta_k$ we have
$\rho^k_{i,j} = 0$.

Now suppose that $i$ is adjacent to at least one site in $\Theta_k$, that is $i \in \partial\Theta_k$. We consider the
following five cases, which by construction are exhaustive of all possible configurations and mutually
exclusive. In the diagrams that relate to these cases a dotted line between a site $j \in \Theta_k$ and a colour
1, say, denotes that no site adjacent to $j$ on the boundary of $\Theta_k$ (other than possibly $i$) is coloured
1. A full line denotes that some site adjacent to $j$ on the boundary of $\Theta_k$ (other than possibly $i$)
is coloured 1. The full details of each case of the coupling will be given in Section 3.2 along with
bounds on $\rho^k_{i,j}$ and $\rho^k_{i,j'}$, where $j$ and $j'$ are the sites included in $\Theta_k$.

1. Exactly one site in $\Theta_k$ is adjacent to $i$. Let this site be labeled $j$ and let the other site in $\Theta_k$
be labeled $j'$. This is shown in Figure 1.

2. Both sites in $\Theta_k$ are adjacent to $i$ and no other sites in $\partial\Theta_k$ are coloured 1 or 2. The labeling
of the sites in $\Theta_k$ is arbitrary. This is shown in Figure 2.

3. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least one site, other
than $i$, coloured 1. Let this site be labeled $j'$. The other site in $\Theta_k$ is labeled $j$ and it is not
adjacent to any site, other than $i$, coloured 1 or 2. This is shown in Figure 3.

4. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least one site, other
than $i$, coloured 1 and no sites that are coloured 2. Let this site be labeled $j'$. The other site
in $\Theta_k$, labeled $j$, is adjacent to at least one site other than $i$ coloured 2 and no sites coloured
1. This is shown in Figure 4.

5. Both sites in $\Theta_k$ are adjacent to $i$ and at least one site, other than $i$, coloured 1. The labeling
of the sites in $\Theta_k$ is arbitrary. This is shown in Figure 5.

Figure 1: Case 1. Exactly one site in $\Theta_k$ is adjacent to $i$. Let this site be labeled $j$ and let the other
site in $\Theta_k$ be labeled $j'$.

Figure 2: Case 2. Both sites in $\Theta_k$ are adjacent to $i$ and no other sites in $\partial\Theta_k$ are coloured 1 or 2.
The labeling of the sites in $\Theta_k$ is arbitrary.

Figure 3: Case 3. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least
one site, other than $i$, coloured 1. Let this site be labeled $j'$. The other site in $\Theta_k$ is labeled $j$ and
it is not adjacent to any site, other than $i$, coloured 1 or 2.

Figure 4: Case 4. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least
one site, other than $i$, coloured 1 and no sites that are coloured 2. Let this site be labeled $j'$. The
other site in $\Theta_k$, labeled $j$, is adjacent to at least one site other than $i$ coloured 2 and no sites
coloured 1.

Figure 5: Case 5. Both sites in $\Theta_k$ are adjacent to $i$ and at least one site, other than $i$, coloured 1.
The labeling of the sites in $\Theta_k$ is arbitrary.

Figure 6: Case 1. Exactly one site in $\Theta_k$ is adjacent to $i$. Let this site be labeled $j$ and let the other
site in $\Theta_k$ be labeled $j'$.

3.2 Details of Coupling and Proof of Mixing 

We will now give the full details of each case of the coupling and establish the required bounds on 
the influence of site i on sites j and j' . The following lemma is required to establish the coupling 
for all the stated cases. 

Lemma 9. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i, j\} \in E(G)$. Then for
each pair of colours $c_1, c_2 \in C \setminus \{1, 2\}$ the choice $c_1 c_2$ is valid for $D_1$ if and only if $c_1 c_2$ is valid for $D_2$.

Proof. We start with the if direction. Suppose $c_1 c_2$ is valid in $D_2$. Then no site adjacent to $j$ has
colour $c_1$ in $D_2$ and since $c_1 \neq 1$ no site adjacent to $j$ has colour $c_1$ in $D_1$. Also no site adjacent to
$j'$ has colour $c_2$ in $D_2$, hence no site adjacent to $j'$ has colour $c_2$ in $D_1$ since $c_2 \neq 1$. Since $c_1 c_2$ is
valid in $D_2$, $c_1 \neq c_2$ and so $c_1 c_2$ is valid in $D_1$.

The only if direction is similar. Suppose $c_1 c_2$ is valid in $D_1$. Then no site adjacent to $j$ has colour
$c_1$ in $D_1$ and since $c_1 \neq 2$ no site adjacent to $j$ has colour $c_1$ in $D_2$. Also no site adjacent to $j'$ has
colour $c_2$ in $D_1$, hence no site adjacent to $j'$ has colour $c_2$ in $D_2$, again since $c_2 \neq 2$. Since $c_1 c_2$ is
valid in $D_1$, $c_1 \neq c_2$ and so $c_1 c_2$ is valid in $D_2$. □

Details of case 1. (Repeated in Figure 6.) We construct a coupling $\Psi_k(x,y)$ of the distributions $D_1$ and $D_2$ using the following two step process. Let $\psi_j$ be a coupling of $D_1(j)$ and $D_2(j)$ which greedily maximises the probability of assigning the same colour to site $j$ in each distribution. Then, for each pair of colours $(c,c')$ drawn from $\psi_j$, $\Psi_k(x,y)$ is a coupling, minimising Hamming distance, of the conditional distributions $D_1 \mid j = c$ and $D_2 \mid j = c'$.
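The first step of this construction is a coupling that maximises the probability of agreement at one site. A minimal sketch of such a greedy coupling for two distributions on a finite set of colours is given below. The function name and the dictionary representation are assumptions for illustration, and the way leftover mass is paired is arbitrary (any pairing of the leftover mass gives the same agreement probability).

```python
from fractions import Fraction

def greedy_coupling(mu, nu):
    """Couple two distributions (dicts colour -> probability) so that the
    probability of drawing the same colour from both is as large as possible.
    Returns a dict mapping pairs (a, b) to their joint probability."""
    joint, rest_mu, rest_nu = {}, {}, {}
    for c in set(mu) | set(nu):
        pc, qc = mu.get(c, 0), nu.get(c, 0)
        m = min(pc, qc)
        if m > 0:
            joint[(c, c)] = m          # put as much mass as possible on agreement
        if pc > m:
            rest_mu[c] = pc - m
        if qc > m:
            rest_nu[c] = qc - m
    # The leftover masses have disjoint supports; pair them up in sorted order.
    a_items, b_items = sorted(rest_mu.items()), sorted(rest_nu.items())
    i = k = 0
    while i < len(a_items) and k < len(b_items):
        (a, pa), (b, pb) = a_items[i], b_items[k]
        m = min(pa, pb)
        joint[(a, b)] = joint.get((a, b), 0) + m
        a_items[i], b_items[k] = (a, pa - m), (b, pb - m)
        if a_items[i][1] == 0:
            i += 1
        if b_items[k][1] == 0:
            k += 1
    return joint

# Illustration only: colour 1 is unavailable in the first distribution and
# colour 2 in the second, as happens for site j when i is coloured 1 and 2.
third = Fraction(1, 3)
example = greedy_coupling({2: third, 3: third, 4: third},
                          {1: third, 3: third, 4: third})
```

In this illustration the only disagreement arises when colour 2 is drawn from the first distribution, which mirrors the role of the choice 2 for $j$ in $D_1$ in the proof of Lemma 10.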

Lemma 10. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$. If $\{i,j\} \in E(G)$ and $\{i,j'\} \notin E(G)$ then

$$\rho^k_{i,j} \le \frac{1}{q-\Delta} \quad\text{and}\quad \rho^k_{i,j'} \le \frac{1}{(q-\Delta)^2}.$$

Proof. Assume without loss of generality that $d_1 \ge d_2$, i.e. that there are at least as many valid choices for $D_1$ as for $D_2$. Since the only site in $\Theta_k$ that is adjacent to site $i$ is $j$, Lemma 13 of Goldberg, Martin and Paterson [10] lets us upper bound the probability of a discrepancy at site $j$ in a pair of configurations drawn from the coupling $\Psi_k(x,y)$ by assuming that $j'$ is assigned the worst case colour. Now, 1 is not valid for $j$ in $D_1$ so Lemma 9 implies that only the choice 2 for $j$ in $D_1$ would cause site $j$ to be assigned a different colour in each configuration drawn from the coupling. Now observe that site $j$ has at most $\Delta - 1$ neighbours (excluding $j'$) and each of them could invalidate one colour choice for $j$ in both distributions. If $j'$ is assigned a colour not already adjacent to $j$ then $j$ is adjacent to at most $\Delta$ sites each assigned a different colour, leaving at least $q - \Delta$ valid colours for $j$ in $D_1$, and so the probability of assigning 2 to $j$ in $D_1$ during step 1 of the coupling is at most $\frac{1}{q-\Delta}$ since the coupling is greedy. This establishes the bound on $\rho^k_{i,j}$ since

$$\rho^k_{i,j} = \max_{(x,y)\in S_i}\left\{\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j)\right\} \le \frac{1}{q-\Delta}.$$

Now from the definition of the coupling it follows easily that if the same colour, $c$, is assigned to site $j$ in each distribution during the first step of the coupling then the colour assigned to site $j'$ in the second step will be the same in each distribution, since the conditional distributions $D_1 \mid j = c$ and $D_2 \mid j = c$ are the same. If different colours are assigned to $j$ in each distribution then the second step of the coupling is simply the case of colouring a single site adjacent to exactly one discrepancy. The argument from above says that at most one colour assigned to $j'$ in $D_1$ will cause a discrepancy at site $j'$ in the coupling and also that there are at least $q - \Delta$ valid choices for $j'$ in $D_1$. Hence we have

$$\max_{(x,y)\in S_i}\left\{\Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} \ne y'_{j'} \mid x'_j = c,\, y'_j = c')\right\} \le \frac{1}{q-\Delta}$$

and so

$$\begin{aligned}
\rho^k_{i,j'} &= \max_{(x,y)\in S_i}\left\{\Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} \ne y'_{j'})\right\} \\
&= \max_{(x,y)\in S_i}\left\{\sum_{c,c'} \Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} \ne y'_{j'} \mid x'_j = c,\, y'_j = c')\,\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j = c,\, y'_j = c')\right\} \\
&\le \frac{1}{q-\Delta}\max_{(x,y)\in S_i}\left\{\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j)\right\} \\
&\le \frac{1}{(q-\Delta)^2}
\end{aligned}$$

using the bound on $\rho^k_{i,j}$, which completes the proof. □

The following lemmas are required to define the coupling and bound the influence of a site $i \in \partial\Theta_k$ on sites $j$ and $j'$ when $i$ is adjacent to both sites $j$ and $j'$.

Lemma 11. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 1 is valid for $j$ in $D_2$ and 2 is valid for $j$ in $D_1$ then the choice $2c_2$ is valid in $D_1$ if and only if $1c_2$ is valid in $D_2$.

Proof. Suppose that $2c_2$ is valid in $D_1$. Then $c_2 \in C \setminus \{1,2\}$ since $i$ is adjacent to $j'$ (and $x_i = 1$). Since 1 is valid for $j$ in $D_2$ it follows that $1c_2$ is valid in $D_2$, since the only colour adjacent to $j'$ in $D_2$ that is (possibly) not adjacent to $j'$ in $D_1$ is 2, but $c_2 \ne 2$.

For the reverse direction suppose that $1c_2$ is valid in $D_2$. Then $c_2 \in C \setminus \{1,2\}$ since $i$ is adjacent to $j'$. Since 2 is valid for $j$ in $D_1$ it follows that $2c_2$ is valid in $D_1$, since the only colour adjacent to $j'$ in $D_1$ that is (possibly) not adjacent to $j'$ in $D_2$ is 1, but $c_2 \ne 1$. □

Lemma 12. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 1 is valid for $j'$ in $D_2$ and 2 is valid for $j'$ in $D_1$ then the choice $c_1 2$ is valid in $D_1$ if and only if $c_1 1$ is valid in $D_2$.

Proof. Suppose that $c_1 2$ is valid in $D_1$. Then $c_1 \in C \setminus \{1,2\}$ since $i$ is adjacent to $j$. Since 1 is valid for $j'$ in $D_2$, $c_1 1$ is valid in $D_2$, since the only colour adjacent to $j$ in $D_2$ that is (possibly) not adjacent to $j$ in $D_1$ is 2, but $c_1 \ne 2$.

Also, suppose that $c_1 1$ is valid in $D_2$. Then $c_1 \in C \setminus \{1,2\}$ since $i$ is adjacent to $j$. Since 2 is valid for $j'$ in $D_1$, $c_1 2$ is valid in $D_1$, since the only colour adjacent to $j$ in $D_1$ that is (possibly) not adjacent to $j$ in $D_2$ is 1, but $c_1 \ne 1$. □

Lemma 13. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$.

(i) Suppose that 1 is valid for $j$ in $D_2$. For all $c \in C$ where $c$ is valid for $j$ in $D_2$, if 1 is valid for $j'$ in $D_2$ then

$$d_{2,j=1} \le d_{2,j=c} \le d_{2,j=1} + 1$$

else

$$d_{2,j=1} - 1 \le d_{2,j=c} \le d_{2,j=1}.$$

(ii) Suppose that 2 is valid for $j$ in $D_1$. For all $c \in C$ where $c$ is valid for $j$ in $D_1$, if 2 is valid for $j'$ in $D_1$ then

$$d_{1,j=2} \le d_{1,j=c} \le d_{1,j=2} + 1$$

else

$$d_{1,j=2} - 1 \le d_{1,j=c} \le d_{1,j=2}.$$

Proof. Part (i). Consider some valid colour $c$ other than 1 for $j$ in $D_2$. For each valid choice $1c_2$ for $D_2$ the choice $cc_2$ is also valid for $D_2$ except when $c = c_2$. If 1 is valid for $j'$ in $D_2$ then the choice $c1$ is also valid for $D_2$.

Now consider some invalid choice $1c_2$ for $D_2$ where $c_2 \ne 1$. Since $1c_2$ is not valid for $D_2$ it follows that $c_2$ is not valid for $j'$ in $D_2$ and hence no more choices can be valid for $D_2$, which guarantees the upper bounds.

Part (ii) is similar. Consider some valid colour $c$ other than 2 for $j$ in $D_1$. For each valid choice $2c_2$ for $D_1$ the choice $cc_2$ is also valid for $D_1$ except when $c = c_2$. If 2 is valid for $j'$ in $D_1$ then the choice $c2$ is also valid for $D_1$.

Finally consider some invalid choice $2c_2$ for $D_1$ where $c_2 \ne 2$. Since $2c_2$ is not valid for $D_1$ it follows that $c_2$ is not valid for $j'$ in $D_1$ and hence no more choices can be valid for $D_1$, which guarantees the upper bounds. □

We are now ready to define the coupling for the remaining cases. 

Figure 7: Case 2. Both sites in $\Theta_k$ are adjacent to $i$ and no other sites in $\partial\Theta_k$ are coloured 1 or 2. The labeling of the sites in $\Theta_k$ is arbitrary.

Details of case 2. (Repeated in Figure 7.) We construct the coupling $\Psi_k(x,y)$ of the distributions $D_1$ and $D_2$ as follows. For each valid choice of the form $c_1c_2$ for $D_1$ where $c_1 \ne 2$ and $c_2 \ne 2$ Lemma 9 guarantees that $c_1c_2$ is valid for $D_2$ so we let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = y' = c_1c_2) = \frac{1}{d_1}.$$

For each valid choice of the form $2c_2$ in $D_1$ the choice $1c_2$ is valid in $D_2$ by Lemma 11 so we let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = 2c_2,\, y' = 1c_2) = \frac{1}{d_1}. \qquad (3)$$

Lemma 11 also guarantees that there are no remaining valid choices for $D_2$ of the form $1c_2$. Finally for each valid choice $c_1 2$ for $D_1$ the choice $c_1 1$ is valid in $D_2$ by Lemma 12 so let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = c_1 2,\, y' = c_1 1) = \frac{1}{d_1} \qquad (4)$$

which completes the coupling since $d_1 = d_2$ and all the probability in both $D_1$ and $D_2$ has hence been used.

Lemma 14. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 2 is valid for both $j$ and $j'$ in $D_1$ and 1 is valid for both $j$ and $j'$ in $D_2$ then

$$\rho^k_{i,j} \le \frac{1}{q-\Delta+1} \quad\text{and}\quad \rho^k_{i,j'} \le \frac{1}{q-\Delta}.$$

Proof. This is case 2 of the coupling. Note from Lemma 11 that $d_{1,j=2} = d_{2,j=1}$, so for ease of reference let $d = d_{1,j=2} = d_{2,j=1}$ and let $d' = d_{1,j'=2} = d_{2,j'=1}$ by Lemma 12. Also let $s = \sum_c d_{2,j=c} - d - d'$, which is the number of valid choices for $D_2$ other than choices of the form $1c_2$ and $c_1 1$. Note that the number of valid choices for $D_1$ is $d_1 = s + d + d'$.

As there are no restrictions on colours assigned to the sites in $\partial\Theta_k \setminus \{i\}$ each of the neighbours of $j$ could be assigned a different colour, and the same is true for the neighbours of $j'$. Hence we get the following lower bounds on $d$ and $d'$:

$$q - \Delta \le d \quad\text{and}\quad q - \Delta \le d'.$$

To lower bound $s$ observe that $s = \sum_c d_{2,j=c} - d - d' = \sum_{c \ne 1} d_{2,j=c} - d'$. Let $J \subseteq C \setminus \{1\}$ be the set of colours, excluding 1, that are valid for $j$ in $D_2$. By definition of $d'$, at least $d'$ colours other than 1 must be valid for site $j$ in $D_2$, so the size of $J$ is at least $d'$. Since 1 is valid for $j'$ in $D_2$ we use the lower bound on $d_{2,j=c}$ from Lemma 13 (i) and hence

$$s = \sum_{c \in J} d_{2,j=c} - d' \ge d' \min_{c \in J}\{d_{2,j=c}\} - d' \ge d'd - d'.$$

From the coupling, $j$ will be assigned a different colour in each distribution whenever a choice of the form $2c_2$ is made for $D_1$. From (3) this happens with probability $\frac{d}{d_1} = \frac{d}{d+d'+s}$ since $d$ is the number of valid choices for $D_1$ of the form $2c_2$. Similarly from (4), $j'$ will become a discrepancy in the coupling whenever a choice of the form $c_1 2$ is made for $D_1$, which happens with probability $\frac{d'}{d+d'+s}$. Hence

$$\rho^k_{i,j} \le \frac{d}{d+d'+s} \quad\text{and}\quad \rho^k_{i,j'} \le \frac{d'}{d+d'+s}.$$

Starting with $\rho^k_{i,j}$,

$$\rho^k_{i,j} \le \frac{d}{d+d'+s} \le \frac{d}{d+dd'} = \frac{1}{d'+1} \le \frac{1}{q-\Delta+1}$$

using the lower bounds on $s$ and $d'$. Similarly, using the lower bounds on $s$ and $d$,

$$\rho^k_{i,j'} \le \frac{d'}{d+d'+s} \le \frac{d'}{d+dd'} \le \frac{1}{d} \le \frac{1}{q-\Delta}$$

which implies the statement of the lemma. □

Figure 8: Case 3. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least one site, other than $i$, coloured 1. Let this site be labeled $j'$. The other site in $\Theta_k$ is labeled $j$ and it is not adjacent to any site, other than $i$, coloured 1 or 2.

Details of case 3. (Repeated in Figure 8.) We construct the coupling $\Psi_k(x,y)$ of $D_1$ and $D_2$ using the following two step process. Let $\psi_{j'}$ be a coupling of $D_1(j')$ and $D_2(j')$ which greedily maximises the probability of assigning the same colour to site $j'$ in each distribution. Then for each pair of colours $(c,c')$ drawn from $\psi_{j'}$ we complete $\Psi_k(x,y)$ by letting it be the coupling, greedily minimising Hamming distance, of the conditional distributions $D_1 \mid j' = c$ and $D_2 \mid j' = c'$.

Lemma 15. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 2 is valid for $j$ in $D_1$, 1 is valid for $j$ in $D_2$ and 1 is not valid for $j'$ in $D_2$ then

$$\rho^k_{i,j'} \le \frac{1}{q-\Delta+1} \quad\text{and}\quad \rho^k_{i,j} \le \frac{1}{q-\Delta}.$$

Figure 9: The pair of configurations after the colour of site $j'$ has been assigned during the first step of the coupling.

Proof. This is case 3 of the coupling. Note from Lemma 11 that $d_{1,j=2} = d_{2,j=1}$ and let $s = \sum_c d_{2,j=c} - d_{2,j=1} = \sum_{c \ne 1} d_{2,j=c}$ denote the number of valid choices for $D_2$ other than choices of the form $1c_2$. The number of valid choices for $D_1$ is then $d_1 = s + d_{1,j=2} + d_{1,j'=2}$.

Since 1 is not valid for $j'$ in $D_2$ at least one site other than $i$ on the boundary of $\Theta_k$ must be coloured 1 in $D_1$ (we say that some site $s$ on the boundary of $\Theta_k$ is coloured $c$ in $D_1$ if there exists a configuration with positive measure in $D_1$ in which site $s$ is coloured $c$). As there are no restrictions on the neighbourhood of $j$ each neighbour of $j$ may be assigned a different colour in $D_1$. Hence we get the following lower bounds on $d_{1,j=2}$ and $d_{1,j'=2}$:

$$q - \Delta + 1 \le d_{1,j=2} \quad\text{and}\quad q - \Delta \le d_{1,j'=2}.$$

To lower bound $s$ observe that exactly $d_{1,j'=2}$ colours other than 1 are valid for site $j$ in $D_2$ and let $J$ be the set of colours, excluding 1, that are valid for $j$ in $D_2$. Then

$$s = \sum_{c \in J} d_{2,j=c} \ge d_{1,j'=2} \min_{c \in J}\{d_{2,j=c}\} \ge d_{1,j'=2}\,(d_{1,j=2} - 1)$$

where we used the bound $d_{1,j=2} - 1 \le d_{2,j=c}$ for $c \in J$ from Lemma 13 (i) since 1 is not valid for $j'$ in $D_2$.

We consider $\rho^k_{i,j'}$ first. Suppose that a choice of the form $c_1c_2$ is valid for $D_2$, in which case $c_1 \ne 2$ and $c_2 \notin \{1,2\}$ by the conditions of case 3 of the coupling. Firstly if $c_1 \ne 1$ then $c_1c_2$ is also valid for $D_1$ by Lemma 9. If $c_1 = 1$ then the choice $2c_2$ is valid for $D_1$ by Lemma 11 and hence $d_1 \ge d_2$. Note in particular that if a choice $c_1c_2$ where $c_2 \ne 2$ is valid for $D_1$ then it is also valid for $D_2$. Therefore, a different colour will only be assigned to site $j'$ in each distribution if $j'$ is coloured 2 in $D_1$ during the first step of the coupling, since the Hamming distance at site $j'$ is minimised greedily. There are $d_{1,j'=2}$ colourings assigning 2 to $j'$ in $D_1$ and hence

$$\rho^k_{i,j'} \le \frac{d_{1,j'=2}}{d_1} = \frac{d_{1,j'=2}}{d_{1,j=2} + d_{1,j'=2} + s} \le \frac{d_{1,j'=2}}{(1 + d_{1,j'=2})\,d_{1,j=2}} \le \frac{1}{d_{1,j=2}} \le \frac{1}{q-\Delta+1}$$

using the lower bounds on $s$ and $d_{1,j=2}$.

Now consider $\rho^k_{i,j}$. Suppose that $(c'_1, c'_2)$ is the pair of colours drawn for site $j'$ in the first step of the coupling. The second step of $\Psi_k(x,y)$ then couples the conditional distributions $D_1 \mid j' = c'_1$ and $D_2 \mid j' = c'_2$ greedily to minimise Hamming distance. First suppose that $c'_1 \ne c'_2$. It was pointed out in the analysis above that if $c'_1 \ne c'_2$ then $c'_1 = 2$, and the resulting configuration is shown in Figure 9. We make the following observations about the resulting conditional distributions $D_1 \mid j' = 2$ and $D_2 \mid j' = c'_2$.

• The colour 2 is not valid for $j$ in either distribution $D_1 \mid j' = 2$ or $D_2 \mid j' = c'_2$.

• The colour 1 is not valid for $j$ in $D_1 \mid j' = 2$ but could be valid for $j$ in $D_2 \mid j' = c'_2$.

• The colour $c'_2$ could be valid for $j$ in $D_1 \mid j' = 2$ but is not valid for $j$ in $D_2 \mid j' = c'_2$.

• For each $c \in C \setminus \{1, 2, c'_2\}$ the colour $c$ is valid for $j$ in $D_1 \mid j' = 2$ if and only if $c$ is valid for $j$ in $D_2 \mid j' = c'_2$.

These observations show that this case is a single-site disagreement sub problem and that there must be at least $(q-3) - (\Delta-2) = q - \Delta - 1$ colours that are valid for $j$ in both conditional distributions, since $j$ has at most $\Delta - 2$ neighbours other than $i$ and $j'$. Also, there is at most one colour which is valid for $j$ in one distribution but not in the other, and since the coupling greedily minimises Hamming distance this implies

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j \mid x'_{j'} \ne y'_{j'}) \le \frac{1}{q-\Delta}.$$

Now suppose that the same colour $c$, say, is drawn for site $j'$ in both distributions during the first step of the coupling. Then the only site adjacent to $j$ that is coloured differently in the conditional distributions $D_1 \mid j' = c$ and $D_2 \mid j' = c$ is site $i$, so using a similar reasoning to above we find

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j \mid x'_{j'} = y'_{j'}) \le \frac{1}{q-\Delta}$$

and thus

$$\begin{aligned}
\rho^k_{i,j} &= \max_{(x,y)\in S_i}\left\{\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j)\right\} \\
&= \max_{(x,y)\in S_i}\Big\{\Pr(x'_j \ne y'_j \mid x'_{j'} \ne y'_{j'})\,\Pr(x'_{j'} \ne y'_{j'}) + \Pr(x'_j \ne y'_j \mid x'_{j'} = y'_{j'})\,\Pr(x'_{j'} = y'_{j'})\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{1}{q-\Delta}\Pr(x'_{j'} \ne y'_{j'}) + \frac{1}{q-\Delta}\Pr(x'_{j'} = y'_{j'})\Big\} \\
&= \frac{1}{q-\Delta}\max_{(x,y)\in S_i}\Big\{\Pr(x'_{j'} \ne y'_{j'}) + \Pr(x'_{j'} = y'_{j'})\Big\} = \frac{1}{q-\Delta}
\end{aligned}$$

where every probability is over $(x',y') \in \Psi_k(x,y)$, which completes the proof. □

Figure 10: Case 4. Both sites in $\Theta_k$ are adjacent to $i$. One of the sites in $\Theta_k$ is adjacent to at least one site, other than $i$, coloured 1 and no sites that are coloured 2. Let this site be labeled $j'$. The other site in $\Theta_k$, labeled $j$, is adjacent to at least one site other than $i$ coloured 2 and no sites coloured 1.

Details of case 4. (Repeated in Figure 10.) We assume without loss of generality that $d_1 \ge d_2$ and construct the coupling $\Psi_k(x,y)$ of $D_1$ and $D_2$ as follows. For each valid choice of the form $c_1c_2$ for $D_1$ where $c_1 \ne 1$ and $c_2 \ne 2$ Lemma 9 guarantees that $c_1c_2$ is also valid for $D_2$ so we construct $\Psi_k(x,y)$ such that

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = y' = c_1c_2) = \frac{1}{d_1}.$$

This leaves the set $Z_1 = \{c_1 2 \mid c_1 2 \text{ valid in } D_1\}$ of valid choices for $D_1$ and $Z_2 = \{1c_2 \mid 1c_2 \text{ valid in } D_2\} \subseteq D_2$ for $D_2$. Observe that $z_1 \ge z_2$ where $z_1$ and $z_2$ denote the size of $Z_1$ and $Z_2$ respectively. Let $Z_1(t)$ denote the $t$-th element of $Z_1$ and similarly for $Z_2$. Then for $1 \le t \le z_2$ let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = Z_1(t),\, y' = Z_2(t)) = \frac{1}{d_1}$$

and for each pair $z_2 + 1 \le t \le z_1$ and $h \in D_2$ let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = Z_1(t),\, y' = h) = \frac{1}{d_1 d_2}.$$

It is easy to verify that each valid colouring has the correct weight in $\Psi_k(x,y)$ so this completes the coupling.

Lemma 16. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 1 is valid for $j$ in $D_2$, 1 is not valid for $j'$ in $D_2$ and 2 is not valid for $j$ in $D_1$ then

$$\rho^k_{i,j} \le \rho^k_{i,j'} \le \frac{1}{q-\Delta}.$$

Proof. This is case 4 of the coupling. Let $s = \sum_c d_{2,j=c} - d_{2,j=1}$ be the number of valid choices for $D_2$ other than choices of the form $1c_2$. Observe that $d_1 = s + d_{1,j'=2}$ and note that $d_{1,j'=2} \ge d_{2,j=1}$ since we have assumed $d_1 \ge d_2$ in the construction of the coupling. At least one neighbour, other than $i$, of $j'$ on the boundary of $\Theta_k$ is coloured 1 in $D_1$ and we get the following lower bound on $d_{2,j=1}$ since all other neighbours of $j'$ may be assigned a different colour:

$$q - \Delta + 1 \le d_{2,j=1}.$$

We bound $s$ using the same argument as in the proof of Lemma 15 and get

$$d_{1,j'=2}\,(d_{2,j=1} - 1) \le s.$$

Since 2 is not valid for $j$ in $D_1$ the first $d_{2,j=1}$ choices of the form $c_1 2$ for $D_1$ are matched with some choice of the form $1c_2$ for $D_2$ with probability $1/d_1$, resulting in a different colour being assigned to both sites $j$ and $j'$ in each distribution. Each of the $d_{1,j'=2} - d_{2,j=1}$ remaining valid choices for $D_1$ is matched with each valid choice for $D_2$ with probability $\frac{1}{d_1 d_2}$, resulting in a disagreement at $j'$ (since 2 is not valid for $j'$ in $D_2$) and potentially also at $j$, so $\rho^k_{i,j} \le \rho^k_{i,j'}$. Hence the probability of making a choice of the form $c_1 2$ for $D_1$,

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} = 2) = \frac{d_{2,j=1}}{d_1} + \sum_{h \in D_2} \frac{d_{1,j'=2} - d_{2,j=1}}{d_1 d_2} = \frac{d_{1,j'=2}}{d_1},$$

is an upper bound on the disagreement probabilities at both sites $j$ and $j'$. Using the lower bounds on $s$ and $d_{2,j=1}$ we have

$$\rho^k_{i,j} \le \rho^k_{i,j'} \le \frac{d_{1,j'=2}}{d_1} = \frac{d_{1,j'=2}}{d_{1,j'=2} + s} \le \frac{d_{1,j'=2}}{d_{1,j'=2} + (d_{2,j=1} - 1)\,d_{1,j'=2}} \le \frac{1}{q-\Delta}$$

which completes the proof. □

Figure 11: Case 5. Both sites in $\Theta_k$ are adjacent to $i$ and at least one site, other than $i$, coloured 1. The labeling of the sites in $\Theta_k$ is arbitrary.

Details of case 5. (Repeated in Figure 11.) First observe that 1 is not valid for both $j$ and $j'$ in either distribution $D_1$ or $D_2$, so $d_1 = d_2 + d_{1,j=2} + d_{1,j'=2} \ge d_2$ by Lemma 9 since any choice valid for $D_2$ does not assign colour 2 to any site in $\Theta_k$. Let $Z_1$ and $Z_2$ be the sets of colourings valid for $D_1$ and $D_2$ respectively. We define the following mutually exclusive subsets of $Z_1$: $Z_j = \{2c_2 \mid 2c_2 \in Z_1\}$, $Z_{j'} = \{c_1 2 \mid c_1 2 \in Z_1\}$ and $Z = Z_1 \setminus (Z_j \cup Z_{j'}) = Z_2$. By construction, the union of these three subsets is $Z_1$, and note that the size of $Z_j$ is $d_{1,j=2}$, the size of $Z_{j'}$ is $d_{1,j'=2}$ and the size of $Z$ is $d_2$.

First we consider choices from $Z$ for $D_1$. For each choice $h \in Z$ we have $h \in Z_2$ by construction of $Z$ and so we use the identity coupling and let

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x' = y' = h) = \frac{1}{d_1}.$$

We let the remainder of the coupling minimise Hamming distance. First consider the choices for $D_1$ in $Z_j$. We construct $\Psi_k(x,y)$ such that it minimises Hamming distance and assigns probability $1/d_1$ to each choice for $D_1$ in $Z_j$ whilst ensuring that for each choice $g \in Z_2$ for $D_2$

$$\sum_{h \in Z_j} \Pr_{(x',y')\in\Psi_k(x,y)}(x' = h,\, y' = g) = \frac{d_{1,j=2}}{d_1 d_2}.$$

Similarly we assign probability $1/d_1$ to each choice for $D_1$ in $Z_{j'}$ whilst also requiring that for each choice $g \in Z_2$ for $D_2$

$$\sum_{h \in Z_{j'}} \Pr_{(x',y')\in\Psi_k(x,y)}(x' = h,\, y' = g) = \frac{d_{1,j'=2}}{d_1 d_2}.$$

To see that this ensures that the coupling is fair observe that each choice $h \in Z_1$ receives weight $1/d_1$ and each choice $g \in Z_2$ weight

$$\frac{1}{d_1} + \frac{d_{1,j=2}}{d_1 d_2} + \frac{d_{1,j'=2}}{d_1 d_2} = \frac{d_2 + d_{1,j=2} + d_{1,j'=2}}{d_1 d_2} = \frac{1}{d_2}$$

since $d_2 + d_{1,j=2} + d_{1,j'=2} = d_1$.
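The fairness computation for case 5 is a short identity in exact arithmetic, and it can be spot-checked numerically. The sketch below is an illustration only: the function name and argument names are assumptions, with `a` standing for $d_{1,j=2}$ and `b` for $d_{1,j'=2}$, and $d_1 = d_2 + a + b$ as in the construction.

```python
from fractions import Fraction

def weight_on_d2_choice(d2, a, b):
    """Total coupling weight received by one choice g in Z_2, where
    a = d_{1,j=2}, b = d_{1,j'=2} and d1 = d2 + a + b (as in case 5)."""
    d1 = d2 + a + b
    identity = Fraction(1, d1)        # g matched with itself (choices in Z)
    from_zj = Fraction(a, d1 * d2)    # share coming from choices 2c in Z_j
    from_zjp = Fraction(b, d1 * d2)   # share coming from choices c2 in Z_j'
    return identity + from_zj + from_zjp
```

For any positive `d2` and non-negative `a`, `b` the returned weight should equal `Fraction(1, d2)`, the mass that the uniform distribution $D_2$ places on a single valid choice.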

Remark. Note that a coupling satisfying these requirements always exists. We will not give the detailed construction of $\Psi_k(x,y)$ here, but in the subsequent proof we will consider three cases. In the first two cases any coupling minimising Hamming distance will be sufficient to establish the required bounds on the influence of $i$ on $j$. In the final case we will need a detailed construction of the coupling and so will provide it together with the proof for ease of reference.

Lemma 17. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$ and suppose that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$. If 1 is not valid for $j$ in $D_2$ and 1 is not valid for $j'$ in $D_2$ then

$$\rho^k_{i,j} \le \frac{1}{q-\Delta+1} + \frac{1}{(q-\Delta+1)^2} \quad\text{and}\quad \rho^k_{i,j'} \le \frac{1}{q-\Delta+1} + \frac{1}{(q-\Delta+1)^2}.$$

Proof. This is case 5 of the coupling. We consider three separate cases. Firstly suppose that 2 is not valid for both $j$ and $j'$ in $D_1$. Then the only valid choices for $D_1$ are of the form $c_1c_2$ where $c_1, c_2 \in C \setminus \{1,2\}$ and each such choice is also valid in $D_2$ as observed in the construction of the coupling. The same colouring is selected for each distribution and hence

$$\rho^k_{i,j} = 0 \quad\text{and}\quad \rho^k_{i,j'} = 0.$$

Next suppose that exactly one site in $\Theta_k$, $j'$ say, is adjacent to some site coloured 2 in $D_1$. As in the previous case, each choice that is valid in both $D_1$ and $D_2$ is matched using the identity matching and does not cause a discrepancy at any site. However if a choice of the form $2c$ is made for $D_1$ then site $j$ will be coloured differently in each colouring drawn from $\Psi_k(x,y)$ and the colour at site $j'$ may also be different, so $\rho^k_{i,j'} \le \rho^k_{i,j}$. Since all choices of the form $c2$ are not valid for $D_1$, making a choice of the form $2c$ for $D_1$ is the only way to create a disagreement at any site in the coupling and so

$$\rho^k_{i,j'} \le \rho^k_{i,j} \le \frac{d_{1,j=2}}{d_1}$$

since $d_{1,j=2}$ is the number of valid choices for $D_1$ of the form $2c$. We need to establish a lower bound on $d_1$ and observe that, for $c$ valid for $j$ in $D_1$, $d_{1,j=2} - 1 \le d_{1,j=c}$ by Lemma 13 (ii) since 2 is not valid for $j'$ in $D_1$. Let $v$ be the number of colours that are valid for site $j$ in $D_1$. Then $v$ is lower bounded by $q - \Delta + 2 \le v$ since at least two of the sites (including $i$) adjacent to $j$ on the boundary of $\Theta_k$ are coloured 1 in $D_1$. Also, since at least one site (other than $j$ and $i$) adjacent to $j'$ is coloured 1 and another is coloured 2 in $D_1$, we have $q - \Delta + 2 \le d_{1,j=2}$. Using the lower bounds on $v$ and $d_{1,j=c}$ we have, letting $J$ denote the set of colours other than 2 that are valid for $j$ in $D_1$,

$$\begin{aligned}
d_1 = \sum_c d_{1,j=c} &= d_{1,j=2} + \sum_{c \in J} d_{1,j=c} \ge d_{1,j=2} + \sum_{c \in J} (d_{1,j=2} - 1) \\
&\ge (v-1)(d_{1,j=2} - 1) + d_{1,j=2} \ge (q-\Delta+2)\,d_{1,j=2} - (q-\Delta+1)
\end{aligned}$$

and hence using the lower bound on $d_{1,j=2}$

$$\frac{1}{\rho^k_{i,j}} \ge \frac{(q-\Delta+2)\,d_{1,j=2} - (q-\Delta+1)}{d_{1,j=2}} \ge q - \Delta + 2 - \frac{q-\Delta+1}{q-\Delta+2} \ge q - \Delta + 1$$

which gives the bounds required by the statement of the lemma.

Finally consider the case when the colour 2 is valid for both $j$ and $j'$ in $D_1$. In this case we will provide details of the construction of $\Psi_k(x,y)$ when required. We begin by establishing some required bounds. Since 1 is not valid for $j'$ in $D_2$ at least two neighbours of $j'$ (including $i$) must be coloured 1 in $D_1$ and the same applies to the neighbourhood of $j$, so we get the following lower bounds on $d_{1,j=2}$ and $d_{1,j'=2}$:

$$q - \Delta + 1 \le d_{1,j=2} \quad\text{and}\quad q - \Delta + 1 \le d_{1,j'=2}.$$

We also require bounds on $d_{2,j=c}$ and $d_{2,j'=c}$ for other colours $c$. Suppose that the choice $cc'$ is valid in $D_2$. Then, since $c, c' \in C \setminus \{1,2\}$, $cc'$ is also valid for $D_1$ by Lemma 9. Furthermore, the choice $c2$ is valid in $D_1$ (but not $D_2$) so $d_{1,j=c} - 1 = d_{2,j=c}$. Lemma 13 (ii) guarantees that $d_{1,j=2} \le d_{1,j=c} \le d_{1,j=2} + 1$ so

$$d_{1,j=2} - 1 \le d_{2,j=c} \le d_{1,j=2}$$

for any $c$ valid for $j$ in $D_2$, and a similar argument gives the bound

$$d_{1,j'=2} - 1 \le d_{2,j'=c} \le d_{1,j'=2}$$

for any colour $c$ valid for $j'$ in $D_2$. Observe that exactly $d_{1,j'=2}$ colours must be valid for site $j$ in $D_2$, so using the stated bounds on $d_{2,j=c}$ we have the following bounds on $d_2$:

$$d_{1,j'=2}\,(d_{1,j=2} - 1) \le d_2 \le d_{1,j'=2}\,d_{1,j=2}.$$

We bound the probability of disagreements at sites $j$ and $j'$ from choices made for $D_1$. From the coupling we again note that if a choice $c_1c_2$ where $c_1 \ne 2$ and $c_2 \ne 2$ is made for $D_1$ then there will be no disagreements at any site in $\Theta_k$.

Consider making a valid choice of the form $2c$ for $D_1$. Firstly, such a choice for $D_1$ will cause site $j$ to be coloured differently in any pair of colourings drawn from the coupling since 2 is not valid for $j$ in $D_2$. We construct $\Psi_k(x,y)$ such that the choice $2c$ for $D_1$ is matched with a choice of the form $c'c$ for $D_2$ as long as such a choice that has not exceeded its aggregated probability exists. Let $J$ denote the set of choices of the form $c'c$ that are valid for $D_2$ and note that the size of $J$ is $d_{2,j'=c}$. The total aggregated weight of all choices of the form $c'c$ for $D_2$ is

$$\sum_{g \in J}\sum_{h \in Z_j} \Pr_{(x',y')\in\Psi_k(x,y)}(x' = h,\, y' = g) = \sum_{g \in J} \frac{d_{1,j=2}}{d_1 d_2} = \frac{d_{2,j'=c}\,d_{1,j=2}}{d_1 d_2}$$

so as long as

$$\frac{1}{d_1} \le \frac{d_{2,j'=c}\,d_{1,j=2}}{d_1 d_2}$$

there is enough probability available in $Z_2$ to match all the weight of the choice $2c$ for $D_1$ with a choice of the form $c'c$ for $D_2$, hence assigning the same colour, $c$, to site $j'$ in any pair of colourings drawn from the coupling. If there is not enough unassigned weight available in $Z_2$ then the coupling will match as much probability as possible, $\frac{d_{2,j'=c}\,d_{1,j=2}}{d_1 d_2}$, with choices of the form $c'c$ for $Z_2$ but the remaining probability will be matched with choices not assigning colour $c$ to site $j'$ in $Z_2$. Hence we obtain the following probabilities conditioned on making a choice of the form $2c$ for $D_1$:

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j \mid x' = 2c) = 1$$

and

$$\begin{aligned}
\Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} \ne y'_{j'} \mid x' = 2c) &\le \max\left(0,\; 1 - \frac{d_{2,j'=c}\,d_{1,j=2}}{d_2}\right) \\
&\le \max\left(0,\; 1 - \frac{(d_{1,j'=2} - 1)\,d_{1,j=2}}{d_{1,j=2}\,d_{1,j'=2}}\right) \\
&\le \frac{1}{d_{1,j'=2}}
\end{aligned}$$

using the bounds on $d_2$ and $d_{2,j'=c}$. Lastly observe that there are $d_{1,j=2}$ valid choices for $D_1$ of the form $2c$ so

$$\sum_c \Pr_{(x',y')\in\Psi_k(x,y)}(x' = 2c) = \frac{d_{1,j=2}}{d_1} = \frac{d_{1,j=2}}{d_{1,j=2} + d_{1,j'=2} + d_2}.$$

Also consider making a valid choice of the form $c2$ for $D_1$. This case is symmetric to the construction above, but we include it for completeness. A choice of the form $c2$ for $D_1$ will cause site $j'$ to be coloured differently in any pair of colourings drawn from the coupling since 2 is not valid for $j'$ in $D_2$. We hence construct $\Psi_k(x,y)$ such that it matches the choice $c2$ for $D_1$ with a choice of the form $cc'$ for $D_2$ as long as such a choice that has not exceeded its aggregated probability exists. Let $J'$ denote the set of choices of the form $cc'$ that are valid for $D_2$ and note that the size of $J'$ is $d_{2,j=c}$. The total aggregated weight of all choices of the form $cc'$ for $D_2$ is

$$\sum_{g \in J'}\sum_{h \in Z_{j'}} \Pr_{(x',y')\in\Psi_k(x,y)}(x' = h,\, y' = g) = \sum_{g \in J'} \frac{d_{1,j'=2}}{d_1 d_2} = \frac{d_{2,j=c}\,d_{1,j'=2}}{d_1 d_2}$$

so as long as

$$\frac{1}{d_1} \le \frac{d_{2,j=c}\,d_{1,j'=2}}{d_1 d_2}$$

there is enough weight available in $Z_2$ to match all the weight of the choice $c2$ for $D_1$ with a choice of the form $cc'$ for $D_2$, hence assigning the same colour, $c$, to site $j$ in any pair of colourings drawn from the coupling. If there is not enough unassigned weight available in $Z_2$ then the coupling will match as much weight as possible, $\frac{d_{2,j=c}\,d_{1,j'=2}}{d_1 d_2}$, with choices of the form $cc'$ for $Z_2$ but the remaining weight will be matched with choices not assigning colour $c$ to site $j$ in $Z_2$. Hence we obtain the following probabilities conditioned on making a choice of the form $c2$ for $D_1$:

$$\begin{aligned}
\Pr_{(x',y')\in\Psi_k(x,y)}(x'_j \ne y'_j \mid x' = c2) &\le \max\left(0,\; 1 - \frac{d_{2,j=c}\,d_{1,j'=2}}{d_2}\right) \\
&\le \max\left(0,\; 1 - \frac{d_{1,j'=2}\,(d_{1,j=2} - 1)}{d_{1,j=2}\,d_{1,j'=2}}\right) \\
&\le \frac{1}{d_{1,j=2}}
\end{aligned}$$

using the bounds on $d_2$ and $d_{2,j=c}$, and as before we also have

$$\Pr_{(x',y')\in\Psi_k(x,y)}(x'_{j'} \ne y'_{j'} \mid x' = c2) = 1 \quad\text{and}\quad \sum_c \Pr_{(x',y')\in\Psi_k(x,y)}(x' = c2) = \frac{d_{1,j'=2}}{d_{1,j=2} + d_{1,j'=2} + d_2}.$$

Using the conditional probabilities and the bounds on $d_2$, $d_{1,j=2}$ and $d_{1,j'=2}$ we find (with every probability taken over $(x',y') \in \Psi_k(x,y)$)

$$\begin{aligned}
\rho^k_{i,j} &= \max_{(x,y)\in S_i}\left\{\Pr(x'_j \ne y'_j)\right\} \\
&= \max_{(x,y)\in S_i}\Big\{\sum_c \Pr(x'_j \ne y'_j \mid x' = 2c)\,\Pr(x' = 2c) + \sum_c \Pr(x'_j \ne y'_j \mid x' = c2)\,\Pr(x' = c2)\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\sum_c \Pr(x' = 2c) + \frac{1}{d_{1,j=2}}\sum_c \Pr(x' = c2)\Big\} \\
&= \max_{(x,y)\in S_i}\Big\{\frac{d_{1,j=2}}{d_{1,j=2} + d_{1,j'=2} + d_2} + \frac{d_{1,j'=2}}{d_{1,j=2} + d_{1,j'=2} + d_2}\cdot\frac{1}{d_{1,j=2}}\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{d_{1,j=2}}{d_{1,j=2}\,(1 + d_{1,j'=2})} + \frac{d_{1,j'=2}}{d_{1,j=2}\,(1 + d_{1,j'=2})}\cdot\frac{1}{d_{1,j=2}}\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{1}{1 + d_{1,j'=2}} + \frac{1}{d_{1,j=2}^2}\Big\} \\
&\le \frac{1}{q-\Delta+2} + \frac{1}{(q-\Delta+1)^2}
\end{aligned}$$

and again by symmetry

$$\begin{aligned}
\rho^k_{i,j'} &= \max_{(x,y)\in S_i}\left\{\Pr(x'_{j'} \ne y'_{j'})\right\} \\
&= \max_{(x,y)\in S_i}\Big\{\sum_c \Pr(x'_{j'} \ne y'_{j'} \mid x' = 2c)\,\Pr(x' = 2c) + \sum_c \Pr(x'_{j'} \ne y'_{j'} \mid x' = c2)\,\Pr(x' = c2)\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{1}{d_{1,j'=2}}\sum_c \Pr(x' = 2c) + \sum_c \Pr(x' = c2)\Big\} \\
&= \max_{(x,y)\in S_i}\Big\{\frac{d_{1,j'=2}}{d_{1,j=2} + d_{1,j'=2} + d_2} + \frac{d_{1,j=2}}{d_{1,j=2} + d_{1,j'=2} + d_2}\cdot\frac{1}{d_{1,j'=2}}\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{d_{1,j'=2}}{d_{1,j=2}\,(1 + d_{1,j'=2})} + \frac{d_{1,j=2}}{d_{1,j=2}\,(1 + d_{1,j'=2})}\cdot\frac{1}{d_{1,j'=2}}\Big\} \\
&\le \max_{(x,y)\in S_i}\Big\{\frac{1}{d_{1,j=2}} + \frac{1}{(1 + d_{1,j'=2})\,d_{1,j'=2}}\Big\} \\
&\le \frac{1}{q-\Delta+1} + \frac{1}{(q-\Delta+1)^2}
\end{aligned}$$

which implies the statement of the lemma. □

This completes the cases of the coupling and we combine the obtained bounds on $\rho^k_{i,j}$ and $\rho^k_{i,j'}$ in the following corollary of Lemmas 14, 15, 16 and 17, which we use in establishing the mixing time of $\mathcal{M}_{\text{edge}}$.

Corollary 18. Let $j$ and $j'$ be the endpoints of an edge $\Theta_k$. If $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$ then

$$\rho^k_{i,j} \le \frac{1}{q-\Delta} + \frac{1}{(q-\Delta)^2} \quad\text{and}\quad \rho^k_{i,j'} \le \frac{1}{q-\Delta} + \frac{1}{(q-\Delta)^2}.$$

Remark. Note that the bound in Corollary 18 is never tight. This bound could be improved, however this would only allow us to beat the $2\Delta$ bound for special graphs since the bounds in Lemma 10 are tight.

We are now ready to present a proof of Theorem 3.

Theorem 3. Let $G$ be a graph with maximum vertex-degree $\Delta$. If $q \ge 2\Delta$ then

$$\mathrm{Mix}(\mathcal{M}_{\text{edge}}, \varepsilon) \le \Delta^2 \log(n\varepsilon^{-1}).$$

Proof. Let $j$ and $j'$ be the endpoints of an edge represented by a block $\Theta_k$. Let $\alpha_j = \sum_i \rho^k_{i,j}$ be the influence on site $j$ and $\alpha_{j'} = \sum_i \rho^k_{i,j'}$ the influence on $j'$. Then $\alpha = \max(\alpha_j, \alpha_{j'})$. Suppose that $\Theta_k$ is adjacent to $t$ triangles, that is there are $t$ sites $i_1, \dots, i_t$ such that $\{i,j\} \in E(G)$ and $\{i,j'\} \in E(G)$ for each $i \in \{i_1, \dots, i_t\}$. Note that $0 \le t \le \Delta - 1$. There are at most $\Delta - 1 - t$ sites adjacent to $j$ that are not adjacent to $j'$ and at most $\Delta - 1 - t$ sites adjacent to $j'$ that are not adjacent to $j$. From Lemma 10 a site adjacent only to $j$ will emit an influence of at most $\frac{1}{q-\Delta}$ on site $j$, and Lemma 10 also guarantees that a site only adjacent to $j'$ can emit an influence of at most $\frac{1}{(q-\Delta)^2}$ on site $j$. Corollary 18 says that a site adjacent to both $j$ and $j'$ can emit an influence of at most $\frac{1}{q-\Delta} + \frac{1}{(q-\Delta)^2}$ on site $j$ and hence

$$\begin{aligned}
\alpha_j &\le t\left(\frac{1}{q-\Delta} + \frac{1}{(q-\Delta)^2}\right) + (\Delta - 1 - t)\,\frac{1}{q-\Delta} + (\Delta - 1 - t)\,\frac{1}{(q-\Delta)^2} \\
&= \frac{\Delta-1}{q-\Delta} + \frac{\Delta-1}{(q-\Delta)^2}
\end{aligned}$$

and similarly by considering the influence on site $j'$ we find that

$$\alpha_{j'} \le \frac{\Delta-1}{q-\Delta} + \frac{\Delta-1}{(q-\Delta)^2}.$$

Then using our assumption that $q \ge 2\Delta$ we have

$$\alpha = \max(\alpha_j, \alpha_{j'}) \le \frac{\Delta-1}{q-\Delta} + \frac{\Delta-1}{(q-\Delta)^2} \le \frac{\Delta-1}{\Delta} + \frac{\Delta-1}{\Delta^2} = \frac{\Delta^2-1}{\Delta^2} = 1 - \frac{1}{\Delta^2} < 1$$

and we obtain the stated bound on the mixing time by applying Theorem 2. □
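The final arithmetic of this proof is easy to check in exact rational arithmetic. The following minimal sketch evaluates the bound $\frac{\Delta-1}{q-\Delta} + \frac{\Delta-1}{(q-\Delta)^2}$ on $\alpha$. The function name `alpha_bound` is an assumption made for illustration and does not appear in the original text.

```python
from fractions import Fraction

def alpha_bound(q, delta):
    """Upper bound on alpha from the proof above:
    (delta - 1)/(q - delta) + (delta - 1)/(q - delta)^2, as an exact fraction."""
    if q <= delta:
        raise ValueError("the bound is only defined for q > delta")
    gap = q - delta
    return Fraction(delta - 1, gap) + Fraction(delta - 1, gap ** 2)

# At q = 2*delta the bound equals 1 - 1/delta^2, so 1/(1 - bound) = delta^2.
for delta in range(2, 6):
    bound = alpha_bound(2 * delta, delta)
    print(delta, bound, 1 / (1 - bound))
```

At $q = 2\Delta$ the bound equals $1 - 1/\Delta^2$, which is consistent with the factor $\Delta^2$ in the stated mixing time. With one colour fewer, $q = 2\Delta - 1$, the same expression is $1 + \frac{1}{\Delta - 1} > 1$, so this particular bound no longer shows $\alpha < 1$.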



4 Application: Colouring a Tree 

This section contains the proof of Theorem 4 which improves the least number of colours required for mixing of systematic scan on a tree for individual values of $\Delta$. Recall the definition of the systematic scan $\mathcal{M}_{\text{tree}}$ where the set of blocks $\Theta$ is defined as follows. Let the block $\Theta_k$ contain a site $r$ along with all sites below $r$ in the tree that are at most $h - 1$ edges away from $r$. We call $h$ the height of the blocks and $h$ is defined for each $\Delta$ in Table 1. The set of blocks $\Theta$ covers the sites of the tree and we construct $\Theta$ such that no block has height less than $h$. $P^{[k]}$ is the transition matrix for performing a heat-bath move on block $\Theta_k$ and hence $P^{[k]}(x)$ is the uniform distribution on the set of configurations that agree with $x$ off $\Theta_k$ and where no edge incident to a site in $\Theta_k$ is monochromatic. The transition matrix of the Markov chain $\mathcal{M}_{\text{tree}}$ is $\prod_{k=1}^{m} P^{[k]}$ where $m$ is the number of blocks.
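The definition of a single block can be made concrete with a short sketch. The representation of the rooted tree as a dictionary from each site to the list of its children, and the function name `block`, are assumptions for illustration. The sketch only collects the sites of one block and does not construct the full cover $\Theta$.

```python
def block(children, r, h):
    """Sites of the block rooted at r: r together with every site below r
    that is at most h - 1 edges away from r.
    children: dict mapping a site to the list of its children (hypothetical)."""
    sites, frontier = [r], [r]
    for _ in range(h - 1):
        # move one level further down the tree
        frontier = [c for v in frontier for c in children.get(v, [])]
        sites.extend(frontier)
    return sites

# A small rooted tree: 0 is the root, 1 and 2 its children, and so on.
tree = {0: [1, 2], 1: [3, 4], 2: [5], 3: [6]}
print(block(tree, 0, 2))   # the root and its children
print(block(tree, 1, 3))   # site 1 and everything within two edges below it
```

Near the leaves such a block may contain fewer than $h$ levels, so an actual construction of $\Theta$ has to choose the roots with care to meet the requirement that no block has height less than $h$.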

We will use standard terminology when discussing the structure of the tree. In particular we will say that a site $i$ is a descendant of a site $j$ (or $j$ is a predecessor of $i$) if $j$ is on the simple path from the root of the tree to $i$. We will call a site $j$ a child of a site $i$ (or $i$ is the parent of $j$) if $i$ and $j$ are adjacent and $j$ is a descendant of $i$. Finally $N_k(j) = \{i \in \partial\Theta_k \mid i \text{ is a descendant of } j\}$ is the set of descendants of $j$ on the boundary of $\Theta_k$.

Let $(x,y) \in S_i$ where $i$ is on the boundary of some block $\Theta_k$. The following lemma will provide upper bounds on the probability of disagreement at any site in the block.

Lemma 19. Let $(x, y) \in S_i$ and suppose that $i$ is adjacent to exactly one site in a block $\Theta_k$. Then
there exists a coupling $\psi$ of $D_1 = P^{[k]}(x)$ and $D_2 = P^{[k]}(y)$ in which

$$\Pr_{(x',y') \in \psi}(x'_j \ne y'_j) \le \frac{1}{(q-\Delta)^{d(i,j)}}$$

for all $j \in \Theta_k$, where $d(i,j)$ is the edge distance from $i$ to $j$.

Proof. We construct a coupling $\psi$ of $D_1$ and $D_2$ based on the recursive coupling defined in Goldberg
et al. [10]. The following definitions are based on Figure 12. Let $R \subseteq V$ be a set of sites. Also
let $(X, X')$ be a pair of colourings of the sites on the boundary of $R$ (recall that the boundary of
$R$ is the set of sites that are not included in $R$ but are adjacent to some site in $R$) which use the
same colour for every site, except for one site $u$ which is coloured $l$ in $X$ and $l'$ in $X'$. We then say
that $A(R, (X, X'), u, (l, l'))$ is a boundary pair. For a boundary pair $A(R, (X, X'), u, (l, l'))$ we let
$v \in R$ be the site in $R$ that is adjacent to $u$. We think of $v$ as the root of $R$ and note that we may
need to turn the original tree "upside down" in order to achieve this; however, the meaning should
be clear. We then label the children (in $R$) of $v$ as $v_1, \ldots, v_d$ and let $T = \{R_1, \ldots, R_d\}$ be the set
of $d$ subtrees of $R$ that do not contain site $v$, that is for $R_k \in T$ we define
$R_k = \{j \in R \mid j = v_k \text{ or } j \text{ is a descendant of } v_k\}$. Finally let $D$ and $D'$ be the uniform distributions
on colourings of $R$ consistent with the boundary colourings $X$ and $X'$ respectively and let $D(v)$
(respectively $D'(v)$) be the uniform distribution on the colour at site $v$ induced by $D$ (respectively
$D'$). Then $\Psi_R$ is the recursive coupling of $D$ and $D'$ summarised as follows.

1. If $l = l'$ then the distributions $D$ and $D'$ are the same and we use the identity coupling, in
which the same colouring is used in both copies. Otherwise we couple $D(v)$ and $D'(v)$ greedily
to maximise the probability of assigning the same colour to site $v$ in both distributions. If $R$
consists of a single site then this completes the coupling.

2. Suppose that the pair of colours $(c, c')$ were drawn for $v$ in the coupling from step 1. For each
subtree $R' \in \{R_1, \ldots, R_d\}$ we have a well defined boundary pair $A(R', (X_{R'}, X'_{R'}), v, (c, c'))$
where $X_{R'}$ is the boundary colouring $X$ restricted to the sites on the boundary of $R'$. For
each pair of colours $(c, c')$ and $R' \in T$ we recursively construct a coupling $\Psi_{R'}(c, c')$ of the
distributions induced by the boundary pair $A(R', (X_{R'}, X'_{R'}), v, (c, c'))$.

Figure 12: The region defined in a boundary pair and the construction of the subtrees.




Initially we let the boundary pair be $A(R = \Theta_k, (X = x, X' = y), u = i, (l = x_i, l' = y_i))$ and our
coupling $\psi$ of $D_1$ and $D_2$ is thus the recursive coupling constructed above.
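The greedy step of the recursive coupling (step 1) can be illustrated in isolation. The following is a minimal sketch, assuming the two marginal distributions at the root site are already given as dictionaries from colours to probabilities; the function name `greedy_coupling` and the choice to pair the leftover mass proportionally are assumptions of this illustration, not details taken from Goldberg et al. [10].

```python
from fractions import Fraction


def greedy_coupling(p, p_prime):
    """Couple two distributions on colours so that agreement is maximised.

    p, p_prime: dicts mapping colour -> probability (each summing to 1).
    Returns a dict mapping (c, c') -> joint probability whose marginals
    are p and p_prime and whose diagonal mass is sum_c min(p[c], p'[c]).
    """
    joint, rest, rest_prime = {}, {}, {}
    for c in set(p) | set(p_prime):
        a, b = p.get(c, 0), p_prime.get(c, 0)
        m = min(a, b)
        if m > 0:
            joint[(c, c)] = m          # both copies draw colour c
        if a > m:
            rest[c] = a - m            # mass of p not yet matched
        if b > m:
            rest_prime[c] = b - m      # mass of p' not yet matched
    # The unmatched masses have equal totals and disjoint supports, so any
    # pairing of them gives valid marginals; here they are paired
    # proportionally.
    leftover = sum(rest.values())
    for c, a in rest.items():
        for c2, b in rest_prime.items():
            joint[(c, c2)] = a * b / leftover
    return joint


# Illustrative example: five colours 0..4, the neighbour u has colour 0 in
# one copy and colour 1 in the other, and v has no other constraints.
D = {c: Fraction(1, 4) for c in (1, 2, 3, 4)}
D_prime = {c: Fraction(1, 4) for c in (0, 2, 3, 4)}
coupling = greedy_coupling(D, D_prime)
disagree = sum(pr for (c, c2), pr in coupling.items() if c != c2)
```

In this example the two copies disagree at the root with probability $1/4$, since three of the four available colours are shared.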

We prove the statement of the lemma by induction on $d(i,j)$. The base case is $d(i,j) = 1$.
Applying Lemma 13 from Goldberg et al. [10] we can upper bound the probability of $x'_j \ne y'_j$, where
$(x', y')$ is drawn from $\psi$, by assigning the worst possible colouring to neighbours of $j$ in $\Theta_k$. Site $j$
has at most $\Delta - 1$ neighbours (other than $i$) so there are at least $q - \Delta$ colours available for $j$ in
both distributions. There is also at most one colour which is valid for $j$ in $x$ but not in $y$ (and vice
versa) so

$$\Pr_{(x',y') \in \psi}(x'_j \ne y'_j) \le \frac{1}{q-\Delta}.$$

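Now let $R'$ be the subtree of $\Theta_k$ containing site $j$ and let $v$ be the site in $\Theta_k$ adjacent to $i$. Assume
that for $d(v,j) = d(i,j) - 1$ and every pair of colours $c \ne c'$

$$\Pr_{(x',y') \in \Psi_{R'}(c,c')}(x'_j \ne y'_j) \le \frac{1}{(q-\Delta)^{d(v,j)}}.$$

Now for $(x,y) \in S_i$

$$\begin{aligned}
\Pr_{(x',y') \in \psi}(x'_j \ne y'_j) &= \Pr_{(x',y') \in \Psi_{\Theta_k}}(x'_j \ne y'_j) \\
&= \sum_{c \ne c'} \Pr_{(x',y') \in \psi}(x'_v = c, y'_v = c') \Pr_{(x',y') \in \Psi_{R'}(c,c')}(x'_j \ne y'_j) \\
&\le \frac{1}{(q-\Delta)^{d(v,j)}} \sum_{c \ne c'} \Pr_{(x',y') \in \psi}(x'_v = c, y'_v = c') \\
&\le \frac{1}{(q-\Delta)^{d(v,j)}} \cdot \frac{1}{q-\Delta} = \frac{1}{(q-\Delta)^{d(i,j)}}
\end{aligned}$$

where the first inequality is the inductive hypothesis and the last is a consequence of the base
case. □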

We will now use the coupling from Lemma 19 to define the coupling $\Psi_k(x, y)$ of the distributions
$P^{[k]}(x)$ and $P^{[k]}(y)$ for $(x,y) \in S_i$. If $i \in \partial\Theta_k$ then it is adjacent to exactly one site in $\Theta_k$ and we
use the coupling from Lemma 19. If $i \notin \partial\Theta_k$ then the distributions $P^{[k]}(x)$ and $P^{[k]}(y)$ are the same
since we are using heat-bath updates and so we can use the identity coupling. We summarise the
bounds on $\rho^k_{i,j}$ in the following corollary of Lemma 19, which we will use in the proof of Theorem 4.

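Corollary 20. Let $d(i,j)$ denote the number of edges between $i$ and $j$. Then for $j \in \Theta_k$

$$\rho^k_{i,j} \le \begin{cases} \dfrac{1}{(q-\Delta)^{d(i,j)}} & \text{if } i \in \partial\Theta_k \\ 0 & \text{otherwise.} \end{cases}$$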



Theorem 4. Let $G$ be a tree with maximum vertex-degree $\Delta$. If $q \ge f(\Delta)$, where $f(\Delta)$ is specified
in Table 1 for small $\Delta$, then

$$\mathrm{Mix}(\mathcal{M}_{\text{tree}}, \varepsilon) = O(\log(n\varepsilon^{-1})).$$

Proof. We will use Theorem 2 and assign a weight to each site $i$ such that $w_i = \xi^{d_i}$, where $d_i$ is the
edge distance from $i$ to the root and $\xi$ is defined in Table 1 for each $\Delta$. For a block $\Theta_k$ and $j \in \Theta_k$
we let

$$\alpha_{k,j} = \frac{\sum_i w_i \rho^k_{i,j}}{w_j}$$

denote the total weighted influence on site $j$ when updating block $\Theta_k$. For each block $\Theta_k$ and each
site $j \in \Theta_k$ we will upper bound $\alpha_{k,j}$ and hence obtain an upper bound on $\alpha = \max_k \max_{j \in \Theta_k} \alpha_{k,j}$.
Note from Corollary 20 that $\rho^k_{i,j} = 0$ when $i \notin \partial\Theta_k$ so we only need to bound $\rho^k_{i,j}$ for $i \in \partial\Theta_k$.

We first consider a block $\Theta_k$ that does not contain the root. The following labels refer to Figure
13, in which a solid line is an edge and a dotted line denotes the existence of a simple path between
two sites. Let $p \in \partial\Theta_k$ be the predecessor of all sites in $\Theta_k$ and $d_r - 1$ be the distance from $p$ to
the root of the tree, i.e., $w_p = \xi^{d_r - 1}$. The site $r \in \Theta_k$ is a child of $p$. Now consider a site $j \in \Theta_k$
which has distance $d$ to $r$, hence $w_j = \xi^{d + d_r}$ and $d(j,p) = d + 1$. From Corollary 20 it then follows
that the weighted influence of $p$ on $j$ when updating $\Theta_k$ is at most

$$\rho^k_{p,j}\frac{w_p}{w_j} \le \frac{1}{(q-\Delta)^{d(j,p)}} \cdot \frac{\xi^{d_r-1}}{\xi^{d_r+d}} = \frac{1}{(q-\Delta)^{d+1}\xi^{d+1}}.$$

Figure 13: A block in the tree. A solid line indicates an edge and a dotted line the existence of a
path. [Levels marked in the figure: $d_r - 1$, $d_r$, $d_r + d - l$, $d_r + d - l + 1$, $d_r + d$, $d_r + h - 1$ and $d_r + h$.]



Now consider some site $u \in N_k(j)$ which is on the boundary of $\Theta_k$. Since $u \in N_k(j)$ it has weight
$w_u = \xi^{d_r + h}$ and so $d(j, u) = h - d$. Hence Corollary 20 says that the weighted influence of $u$ on $j$
is at most

$$\rho^k_{u,j}\frac{w_u}{w_j} \le \frac{1}{(q-\Delta)^{d(j,u)}} \cdot \frac{\xi^{d_r+h}}{\xi^{d_r+d}} = \frac{\xi^{h-d}}{(q-\Delta)^{h-d}}.$$

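Every site in $\Theta_k$ has at most $\Delta - 1$ children so the number of sites in $N_k(j)$ is at most
$|N_k(j)| \le (\Delta - 1)^{h-d}$ and so, summing over all sites $u \in N_k(j)$, the total weighted influence on $j$
from sites in $N_k(j)$ when updating $\Theta_k$ is at most

$$\sum_{u \in N_k(j)} \rho^k_{u,j}\frac{w_u}{w_j} \le \sum_{u \in N_k(j)} \frac{\xi^{h-d}}{(q-\Delta)^{h-d}} \le \frac{(\Delta-1)^{h-d}\xi^{h-d}}{(q-\Delta)^{h-d}}.$$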



The influence on $j$ from sites in $\partial\Theta_k \setminus (N_k(j) \cup \{p\})$ will now be considered. These are the sites
on the boundary of $\Theta_k$ that are neither descendants nor predecessors of $j$. For each site $v$ between
$j$ and $p$, we will bound the influence on site $j$ from sites $b \in N_k(v)$ that contain $v$ on the simple
path between $b$ and $j$. We call this the influence on $j$ via $v$. Referring to Figure 13, let $v \in \Theta_k$ be a
predecessor of $j$ such that $d(j, v) = l$ and observe that $v$ is on level $d_r + d - l$ in the tree and also
that $1 \le l \le d$ since $v$ is between $p$ and $j$ in the tree. If $v$ is not the parent of $j$ (that is $l \ne 1$) then
let $j'$ be the child of $v$ which is also a predecessor of $j$, that is $j'$ is on the simple path from $v$ to $j$.
If $l = 1$ we let $j' = j$. Also let $v'$ be any child of $v$ other than $j'$ and observe that $v'$ and $j'$ are both
on level $d_r + d - l + 1$. Now let $b \in N_k(v')$ be a descendant of $v'$ and note as before that $w_b = \xi^{d_r + h}$.
The distance between $b$ and $v'$ is

$$d(v', b) = d_r + h - (d_r + d - l + 1) = h - d + l - 1$$

and so the number of descendants of $v'$ is at most $|N_k(v')| \le (\Delta - 1)^{h-d+l-1}$ since each site has at
most $\Delta - 1$ children. Site $v$ has at most $\Delta - 2$ children other than $j'$ so the number of sites on the
boundary of $\Theta_k$ that are descendants of $v$ but not $j'$ is at most

$$|N_k(v) \setminus N_k(j')| \le (\Delta - 2)|N_k(v')| \le (\Delta - 2)(\Delta - 1)^{h-d+l-1}.$$

Finally the only simple path from $b$ to $j$ goes via $v$ and the number of edges on this path is

$$d(j, b) = d(j, v) + d(v, v') + d(v', b) = l + 1 + (h - d + l - 1) = h - d + 2l$$

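so, using Corollary 20, the weighted influence of $b$ on site $j$ when updating block $\Theta_k$ is at most

$$\rho^k_{b,j}\frac{w_b}{w_j} \le \frac{1}{(q-\Delta)^{h-d+2l}} \cdot \frac{\xi^{d_r+h}}{\xi^{d_r+d}} = \frac{\xi^{h-d}}{(q-\Delta)^{h-d+2l}}$$

and summing over all descendants of $v$ (other than descendants of $j'$) on the boundary of $\Theta_k$ we
find that the influence on $j$ via site $v$ is at most

$$\sum_{b \in N_k(v) \setminus N_k(j')} \rho^k_{b,j}\frac{w_b}{w_j} \le \sum_{b \in N_k(v) \setminus N_k(j')} \frac{\xi^{h-d}}{(q-\Delta)^{h-d+2l}} \le \xi^{h-d}\frac{(\Delta-2)(\Delta-1)^{h-d+l-1}}{(q-\Delta)^{h-d+2l}}. \qquad (5)$$

Summing (5) over $1 \le l \le d$ gives an upper bound on the total weighted influence of sites in
$\partial\Theta_k \setminus (N_k(j) \cup \{p\})$ on site $j$ when updating $\Theta_k$

$$\sum_{b \in \partial\Theta_k \setminus (N_k(j) \cup \{p\})} \rho^k_{b,j}\frac{w_b}{w_j} \le \xi^{h-d}\sum_{l=1}^{d} \frac{(\Delta-2)(\Delta-1)^{h-d+l-1}}{(q-\Delta)^{h-d+2l}}$$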

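and adding the derived influences we find that the influence on site $j$ (on level $d_r + d$) when updating
$\Theta_k$ is at most

$$\begin{aligned}
\alpha_{k,j} &= \rho^k_{p,j}\frac{w_p}{w_j} + \sum_{u \in N_k(j)} \rho^k_{u,j}\frac{w_u}{w_j} + \sum_{b \in \partial\Theta_k \setminus (N_k(j) \cup \{p\})} \rho^k_{b,j}\frac{w_b}{w_j} \\
&\le \frac{1}{(q-\Delta)^{d+1}\xi^{d+1}} + \frac{(\Delta-1)^{h-d}\xi^{h-d}}{(q-\Delta)^{h-d}} + \xi^{h-d}\sum_{l=1}^{d} \frac{(\Delta-2)(\Delta-1)^{h-d+l-1}}{(q-\Delta)^{h-d+2l}}.
\end{aligned}$$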

Now consider the block containing the root of the tree, $r$. Let this be block $\Theta_0$ and note that
$w_r = 1$. The only difference between $\Theta_0$ and any other block is that $r$ may have $\Delta$ children. There
are at most $\Delta(\Delta - 1)^{h-1}$ descendants of $r$ in $\partial\Theta_0$, each of which has weight $\xi^h$ so, using Corollary
20, the weighted influence on the root is at most

$$\sum_{b \in N_0(r)} \rho^0_{b,r}\frac{w_b}{w_r} \le \frac{\Delta(\Delta-1)^{h-1}\xi^h}{(q-\Delta)^h}.$$

Figure 14: The influence on site $j$ via the root. A line denotes an edge and a dotted line the
existence of a simple path. [Levels marked in the figure: $0$, $1$, $d$ and $h$; site $b$ is on level $h$.]



Now consider a site $j$ on level $d \ne 0$ in block $\Theta_0$. As in the general case considered above there
is an influence of at most

$$\sum_{b \in N_0(j)} \rho^0_{b,j}\frac{w_b}{w_j} \le \frac{(\Delta-1)^{h-d}\xi^{h-d}}{(q-\Delta)^{h-d}}$$

on $j$ from the sites in $N_0(j)$. Now consider the influence on site $j$ from $\partial\Theta_0 \setminus N_0(j)$. We first
consider the influence on $j$ via $r$, which is shown in Figure 14. Site $r$ has at most $\Delta - 1$ children
other than the site $j'$ which is the child of $r$ that is on the path from $r$ to $j$. Each child of $r$ has
at most $(\Delta - 1)^{h-1}$ descendants in $\partial\Theta_0$ and each such descendant has distance $h + d$ to $j$. Hence,
from Corollary 20, the influence on $j$ via the root is at most

$$\sum_{b \in N_0(r) \setminus N_0(j')} \rho^0_{b,j}\frac{w_b}{w_j} \le \sum_{b \in N_0(r) \setminus N_0(j')} \frac{\xi^h}{\xi^d} \cdot \frac{1}{(q-\Delta)^{d(b,j)}} \le \frac{(\Delta-1)^h\xi^{h-d}}{(q-\Delta)^{h+d}}.$$

Finally consider the influence on $j$ from the remaining sites, which are in the set
$R = \partial\Theta_0 \setminus (N_0(j) \cup (N_0(r) \setminus N_0(j')))$. Again consider a site $v \ne r$ in $\Theta_0$ where $v$ is a predecessor of $j$ and
$d(j,v) = l$. In this case we have $1 \le l \le d - 1$ since $l = d$ is the root, which has already been
considered. This is the same situation as arose in the general case considered above (see Figure 13)
so (5) is an upper bound on the influence on $j$ via $v$ and so, summing (5) over $1 \le l \le d - 1$ and
adding the other influences on $j$, we obtain an upper bound on the total weighted influence on site
$j$ when updating block $\Theta_0$



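$$\begin{aligned}
\alpha_{0,j} &= \sum_{b \in N_0(j)} \rho^0_{b,j}\frac{w_b}{w_j} + \sum_{b \in N_0(r) \setminus N_0(j')} \rho^0_{b,j}\frac{w_b}{w_j} + \sum_{b \in R} \rho^0_{b,j}\frac{w_b}{w_j} \\
&\le \frac{(\Delta-1)^{h-d}\xi^{h-d}}{(q-\Delta)^{h-d}} + \frac{(\Delta-1)^h\xi^{h-d}}{(q-\Delta)^{h+d}} + \xi^{h-d}\sum_{l=1}^{d-1} \frac{(\Delta-2)(\Delta-1)^{h-d+l-1}}{(q-\Delta)^{h-d+2l}}.
\end{aligned}$$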

We require $\alpha < 1$, which we obtain by satisfying the system of inequalities given by setting

$$\alpha_{k,j} < 1 \qquad (6)$$

for all blocks $\Theta_k$ and sites $j \in \Theta_k$. In particular we need to find an assignment to $\xi$ and $h$ that
satisfies (6) given $\Delta$ and $q$. Table 1 shows the least number of colours $f(\Delta)$ required for mixing
for small $\Delta$ along with a weight, $\xi$, that satisfies the system of inequalities and the required height
of the blocks, $h$. These values were verified by checking the resulting $2h$ inequalities for each $\Delta$
using Mathematica. The least number of colours required for mixing in the single site setting is also
included in the table for comparison. □
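The check of the $2h$ inequalities can be sketched as follows. This is an illustration in Python rather than the Mathematica verification used for Table 1; the function names are assumptions of this sketch, and the parameter values in the usage lines are chosen only for illustration and are not the values from Table 1. Exact rational arithmetic is used so that the comparison with 1 is not affected by rounding.

```python
from fractions import Fraction


def side_sum(delta, q, h, d, l_max):
    """Sum over l = 1..l_max of (D-2)(D-1)^(h-d+l-1) / (q-D)^(h-d+2l)."""
    return sum(
        Fraction((delta - 2) * (delta - 1) ** (h - d + l - 1),
                 (q - delta) ** (h - d + 2 * l))
        for l in range(1, l_max + 1)
    )


def alpha_general(delta, q, xi, h, d):
    """Bound on the weighted influence on a site at depth d of a non-root block."""
    xi = Fraction(xi)
    from_parent = 1 / ((q - delta) ** (d + 1) * xi ** (d + 1))
    from_below = (Fraction(delta - 1, q - delta) * xi) ** (h - d)
    from_side = xi ** (h - d) * side_sum(delta, q, h, d, d)
    return from_parent + from_below + from_side


def alpha_root(delta, q, xi, h, d):
    """Bound on the weighted influence on a site at level d of the root block."""
    xi = Fraction(xi)
    if d == 0:
        # The root itself may have delta children.
        return delta * (delta - 1) ** (h - 1) * xi ** h / (q - delta) ** h
    from_below = (Fraction(delta - 1, q - delta) * xi) ** (h - d)
    via_root = Fraction((delta - 1) ** h, (q - delta) ** (h + d)) * xi ** (h - d)
    from_side = xi ** (h - d) * side_sum(delta, q, h, d, d - 1)
    return from_below + via_root + from_side


def max_alpha(delta, q, xi, h):
    """Largest of the 2h bounds (depths 0..h-1 for both kinds of block)."""
    return max(
        bound(delta, q, xi, h, d)
        for bound in (alpha_general, alpha_root)
        for d in range(h)
    )


# Illustrative parameters only: Delta = 3, q = 7, xi = 1, h = 2.
satisfied = max_alpha(3, 7, 1, 2) < 1
```

With $h = 1$ and $\xi = 1$ both bounds reduce to $\Delta/(q-\Delta)$, so the check succeeds exactly when $q > 2\Delta$, which is a useful consistency test of the sketch.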



5 A Comparison of Influence Parameters 

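We conclude with a discussion of our choice of influence parameter $\alpha$ denoting the maximum influence
on any site in the graph. As we will be comparing the condition $\alpha < 1$ to the corresponding,
but unweighted, conditions in Dyer et al. [5] and Weitz [17] we will let $w_i = 1$ for each site. Recall
our definitions (letting $w_i = 1$) of $\rho^k_{i,j}$ and $\alpha$

$$\rho^k_{i,j} = \max_{(x,y) \in S_i} \Pr_{(x',y') \in \Psi_k(x,y)}(x'_j \ne y'_j) \quad \text{and} \quad \alpha = \max_k \max_{j \in \Theta_k} \sum_i \rho^k_{i,j}$$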

where $\Psi_k(x,y)$ is a coupling of the distributions $P^{[k]}(x)$ and $P^{[k]}(y)$. We have previously stated that
this is not the standard way to define the influence of $i$ on $j$ since the coupling is directly included
in the definition of $\rho$. It is worth pointing out, however, that the corresponding definition in Weitz
[17], which is also for block dynamics, also makes explicit use of the coupling. In the single site
setting (Dyer et al. [5]) the influence of $i$ on $j$, which we will denote $\hat\rho_{i,j}$, is defined by

$$\hat\rho_{i,j} = \max_{(x,y) \in S_i} d_{TV}(\mu_j(x), \mu_j(y))$$

where $\mu_j(x)$ is the distribution on spins at site $j$ induced by $P^{[j]}(x)$. The corresponding condition is
$\hat\alpha = \max_j \sum_{i \in V} \hat\rho_{i,j} < 1$. We will show (Lemma 21) that $\hat\rho_{i,j}$ is a special case of $\rho^j_{i,j}$ when
$\Theta_j = \{j\}$ and $\Psi_j(x,y)$ is a coupling minimising the Hamming distance at site $j$. This will prove our
claim that our condition $\alpha < 1$ is a generalisation of the single site condition $\hat\alpha < 1$. Before establishing
this claim we discuss the need to include the coupling explicitly when working with block dynamics.
Consider a pair of distinct sites $j \in \Theta_k$ and $j' \in \Theta_k$ and a pair of configurations $(x,y) \in S_i$.
When updating block $\Theta_k$ the dynamics needs to draw a pair of new configurations $(x', y')$ from the
distributions $P^{[k]}(x)$ and $P^{[k]}(y)$ as previously specified. Hence the interaction between $j$ and $j'$ has
to be according to these distributions and so it is not possible to consider the influence of $i$ on $j$ and
the influence of $i$ on $j'$ separately. In the context of our definition of $\rho$ this means that the influence
of $i$ on $j$ and the influence of $i$ on $j'$ have to be defined using the same coupling. This is to say
that the coupling $\Psi_k(x,y)$ can only depend on the block $\Theta_k$ and the initial pair of configurations $x$
and $y$, which in turn specify which site is labelled $i$. It is important to note that the coupling
cannot depend on $j$, since otherwise having a small influence on a site would not imply rapid mixing
of systematic scan (or indeed random update). The reason why we need to make this distinction
when working with block dynamics but not the single site dynamics is that in the single site setting
$\hat\rho_{i,j}$ is the influence of site $i$ on $j$ when updating site $j$ and hence whichever coupling is used must
implicitly depend on $j$. Since the coupling can depend on $j$ in the single site case it is natural to
use the "optimal" coupling, which minimises the probability of having a discrepancy at site $j$. By
definition of total variation distance, the probability of having a discrepancy at site $j$ under the
optimal coupling is $d_{TV}(\mu_j(x), \mu_j(y)) = \hat\rho_{i,j}$ (see e.g. Aldous [1]). We will now show that $\hat\rho_{i,j}$ is a
special case of $\rho^j_{i,j}$ in the way described above.

Lemma 21. Suppose that for each site $j \in V$ we have a block $\Theta_j = \{j\}$ and that $\Theta = \{\Theta_j\}_{j=1,\ldots,n}$.
Also suppose that for each pair $(x,y) \in S_i$ of configurations $\Psi_j(x,y)$ is a coupling of $P^{[j]}(x)$ and
$P^{[j]}(y)$ in which, for each $c \in C$,

$$\Pr_{(x',y') \in \Psi_j(x,y)}(x'_j = y'_j = c) = \min\left(\Pr_{\mu_j(x)}(c), \Pr_{\mu_j(y)}(c)\right)$$

where $\Pr_{\mu_j(x)}(c)$ is the probability of drawing colour $c$ from distribution $\mu_j(x)$. Then $\rho^j_{i,j} = \hat\rho_{i,j}$.

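Proof. To see that $\Psi_j(x,y)$ is a coupling of $P^{[j]}(x)$ and $P^{[j]}(y)$ it is sufficient to observe that
$\Pr_{x' \in P^{[j]}(x)}(x'_j = c) = \Pr_{\mu_j(x)}(c)$ and similarly $\Pr_{y' \in P^{[j]}(y)}(y'_j = c) = \Pr_{\mu_j(y)}(c)$ since $j$ is the only
site in $\Theta_j$. Thus we have

$$\begin{aligned}
\rho^j_{i,j} &= \max_{(x,y) \in S_i} \Pr_{(x',y') \in \Psi_j(x,y)}(x'_j \ne y'_j) \\
&= \max_{(x,y) \in S_i} \Big\{ 1 - \sum_{c \in C} \Pr_{(x',y') \in \Psi_j(x,y)}(x'_j = y'_j = c) \Big\} \\
&= \max_{(x,y) \in S_i} \Big\{ 1 - \sum_{c \in C} \min\left(\Pr_{\mu_j(x)}(c), \Pr_{\mu_j(y)}(c)\right) \Big\} \\
&= \max_{(x,y) \in S_i} \Big\{ \sum_{c \in C} \Pr_{\mu_j(x)}(c) - \min\left(\Pr_{\mu_j(x)}(c), \Pr_{\mu_j(y)}(c)\right) \Big\} \\
&= \max_{(x,y) \in S_i} \sum_{c \in C^+} \left( \Pr_{\mu_j(x)}(c) - \Pr_{\mu_j(y)}(c) \right) \\
&= \max_{(x,y) \in S_i} \frac{1}{2} \sum_{c \in C} \left| \Pr_{\mu_j(x)}(c) - \Pr_{\mu_j(y)}(c) \right| \\
&= \max_{(x,y) \in S_i} d_{TV}(\mu_j(x), \mu_j(y)) = \hat\rho_{i,j}
\end{aligned}$$

where $C^+ = \{c \mid \Pr_{\mu_j(x)}(c) \ge \Pr_{\mu_j(y)}(c)\}$. □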
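A small worked instance may help in following the chain of equalities. Suppose, purely for illustration, that $C = \{1, 2, 3\}$, that $\mu_j(x)$ is uniform on $\{1, 2\}$ and that $\mu_j(y)$ is uniform on $\{1, 3\}$, as happens for proper 3-colourings when the only neighbour $i$ of $j$ has colour 3 in $x$ and colour 2 in $y$. These distributions are assumptions of the example and are not taken from the paper. Then $C^+ = \{1, 2\}$ and

$$1 - \sum_{c \in C} \min\left(\Pr_{\mu_j(x)}(c), \Pr_{\mu_j(y)}(c)\right) = 1 - \tfrac{1}{2} = \tfrac{1}{2}, \qquad \sum_{c \in C^+} \left(\Pr_{\mu_j(x)}(c) - \Pr_{\mu_j(y)}(c)\right) = 0 + \tfrac{1}{2} = \tfrac{1}{2},$$

$$\frac{1}{2} \sum_{c \in C} \left|\Pr_{\mu_j(x)}(c) - \Pr_{\mu_j(y)}(c)\right| = \tfrac{1}{2}\left(0 + \tfrac{1}{2} + \tfrac{1}{2}\right) = \tfrac{1}{2} = d_{TV}(\mu_j(x), \mu_j(y)),$$

so all three expressions agree for this pair of configurations.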



Finally we will show that the condition corresponding to $\alpha < 1$ in Weitz's paper [17] does not
imply rapid mixing of systematic scan. Let $B(j)$ be the set of block indices that contain site $j$ and
$b(j)$ the size of this set. Weitz refers to the sum $\sum_{k \in B(j)} \sum_i \rho^k_{i,j}$ as the total influence on site $j$ and
the parameter representing the maximum influence on a site, which we denote $\alpha_W$ to distinguish it
from our own definition of $\alpha$, is defined as

$$\alpha_W = \max_j \frac{\sum_{k \in B(j)} \sum_i \rho^k_{i,j}}{b(j)}.$$

We note that the single site influence parameter $\hat\alpha$ used in Dyer et al. [5] to prove rapid mixing
of systematic scan is a special case of $\alpha_W$ when the coupling from Lemma 21 is used and each site
is contained in exactly one block of size one.

It is proved in Weitz [17] that the condition $\alpha_W < 1$ implies spatial mixing of a random update
Markov chain and hence that the Gibbs measure is unique. We will now show that the parameters
$\alpha$ and $\alpha_W$ are different and in particular that the condition $\alpha_W < 1$ does not imply rapid mixing of
systematic scan. To show this we exhibit a spin system for which a systematic scan Markov chain
does not mix rapidly but $\alpha_W < 1$. It is sufficient to show that a specific systematic scan does not
mix, since Theorem 2 states that any systematic scan with a specified set of blocks mixes when
$\alpha < 1$.



Observation 22. There exists a spin system for which $\alpha_W < 1$ and $\alpha = 1$ but systematic scan does
not mix.

Consider the following spin system. Let $G$ be the $n$-vertex cycle and label the sites $0, \ldots, n - 1$
and $C$ be the set of $q$ spins. Then $\Theta_i$ (which has an associated transition matrix $P^{[i]}$) is the block
containing site $i$ and $i + 1 \bmod n$ and it is updated as follows:

1. The spin at site $i$ is copied to site $i + 1$;

2. a spin is assigned to site $i$ uniformly at random from the set of all spins.

The stationary distribution, $\pi$, of the spin system is the uniform distribution on all configurations of
$G$. Clearly $P^{[i]}$ satisfies property (1) of the update rule, namely that only sites within the block may
change during the update. To see that $\pi$ is invariant under each $P^{[i]}$ observe that site $i + 1$ takes
the spin of site $i$ in the original configuration and site $i$ receives a spin drawn uniformly at random.
This ensures that each site has probability $1/q$ of having each spin and that they are independent.
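The two-step update rule above can be sketched in a few lines. The function names `update_block` and `scan`, the use of Python's `random.Random`, and the choice to apply the blocks in the order $\Theta_0, \ldots, \Theta_{n-1}$ are assumptions made for this illustration.

```python
import random


def update_block(x, i, q, rng):
    """Update block {i, i+1 mod n} in place on the configuration x."""
    n = len(x)
    x[(i + 1) % n] = x[i]       # step 1: copy the spin at site i to site i+1
    x[i] = rng.randrange(q)     # step 2: fresh uniform spin at site i


def scan(x, q, rng):
    """Apply the blocks once each in the order 0, 1, ..., n-1 to a copy of x."""
    y = list(x)
    for i in range(len(y)):
        update_block(y, i, q, rng)
    return y


rng = random.Random(0)
start = [rng.randrange(3) for _ in range(6)]
after = scan(start, 3, rng)
```

Under this particular order the spin that starts at site 0 is carried once around the cycle, so `after[0] == start[0]` whatever random spins are drawn along the way.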

We define the $\rho$ values for this spin system by using the following coupling. Consider a block
$\Theta_j$ for update. The spin at site $j + 1$ is deterministic in both copies, and each copy selects the
same colour for site $j$ when drawing uniformly at random from $C$. First suppose that site $j$ is the
discrepancy between two configurations. Then, since the spin at $j$ is copied to site $j + 1$, the spin
of site $j + 1$ becomes a disagreement in the coupling and hence $\rho^j_{j,j+1} = 1$. The spin at $j$ is drawn
uniformly at random from $C$ in both copies and coupled perfectly so $\rho^j_{j,j} = 0$. Now suppose that
the two configurations differ at a site $i \ne j$. Then $\rho^j_{i,j+1} = 0$ since both configurations have the
same colour for site $j$, and $\rho^j_{i,j} = 0$ since the spins at site $j$ are coupled perfectly. Using the values
of $\rho$ we deduce that

$$\alpha_W = \max_j \frac{\sum_{k \in B(j)} \sum_i \rho^k_{i,j}}{b(j)} = \frac{1}{2} < 1$$

and $\alpha = \max_k \max_{j \in \Theta_k} \sum_i \rho^k_{i,j} = 1$.
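The two parameters can be computed mechanically from the $\rho$ values just described. The sketch below assumes that $\alpha_W$ is the maximum over sites $j$ of the total influence on $j$ divided by $b(j)$, and that $n \ge 3$ so that each site lies in exactly two distinct blocks; the function names and the dictionary layout are choices made for this illustration.

```python
from fractions import Fraction


def cycle_rho(n):
    """rho[k][(i, j)] for block k = {k, k+1 mod n}; zero entries are omitted.

    The only non-zero influence when updating block k is that of site k
    on site k+1, which equals 1.
    """
    return {k: {(k, (k + 1) % n): 1} for k in range(n)}


def alpha(n, rho):
    """max over blocks k and sites j in block k of sum_i rho[k][(i, j)]."""
    best = 0
    for k in range(n):
        for j in (k, (k + 1) % n):
            best = max(best, sum(rho[k].get((i, j), 0) for i in range(n)))
    return best


def alpha_w(n, rho):
    """max over sites j of (sum over blocks containing j of influence on j) / b(j)."""
    best = Fraction(0)
    for j in range(n):
        blocks = [k for k in range(n) if j in (k, (k + 1) % n)]
        total = sum(rho[k].get((i, j), 0) for k in blocks for i in range(n))
        best = max(best, Fraction(total, len(blocks)))
    return best


rho = cycle_rho(5)
values = (alpha(5, rho), alpha_w(5, rho))
```

For the 5-cycle this gives $\alpha = 1$ and $\alpha_W = 1/2$, matching the values derived above.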

Let $\mathcal{M}_{\rightarrow}$ be the systematic scan Markov chain that updates the blocks in the order
$\Theta_0, \Theta_1, \ldots, \Theta_{n-1}$. For each block $\Theta_j$ note that if a configuration $y$ is obtained from updating block
$\Theta_j$ starting from $x$ then $y_{j+1} = x_j$. Hence when performing the systematic scan, the spin of site 0 in
the original configuration moves around the ring ending at site $n - 1$ before the update of block
$\Theta_{n-1}$ moves it on to site 0. Hence if configuration $x'$ is obtained from one complete scan starting
from a configuration $x$ we have $x'_0 = x_0$ and the systematic scan Markov chain does not mix since
site 0 will always be assigned the same spin after each complete scan.